Global ETD Search

11	A high-performance framework for analyzing massive complex networks Madduri, Kamesh 08 July 2008 Graphs are a fundamental and widely used abstraction for representing data. We can analytically study interesting aspects of real-world complex systems such as the Internet, social systems, transportation networks, and biological interaction data by modeling them as graphs. Graph-theoretic and combinatorial problems are also pervasive in scientific computing and engineering applications. In this dissertation, we address the problem of analyzing large-scale complex networks that represent interactions between hundreds of thousands to billions of entities. We present SNAP, a new high-performance computational framework for efficiently processing graph-theoretic queries on massive datasets. Graph analysis is computationally very different from traditional scientific computing, and solving massive graph-theoretic problems on current high-performance computing systems is challenging for several reasons. First, real-world graphs are often characterized by a low diameter and unbalanced degree distributions, and are difficult to partition on parallel systems. Second, parallel algorithms for solving graph-theoretic problems are typically memory intensive, and the memory accesses are fine-grained and highly irregular. The primary contributions of this dissertation are the design and implementation of novel parallel graph algorithms for traversal, shortest paths, and centrality computations, optimized for the small-world network topology and for high-performance multithreaded architectures and multicore servers. SNAP (Small-world Network Analysis and Partitioning) is a modular, open-source framework for the exploratory analysis and partitioning of large-scale networks.
With SNAP, we demonstrate the capability to process massive graphs with billions of vertices and edges, and achieve up to two orders of magnitude speedup over state-of-the-art network analysis approaches. We also design a new parallel computing benchmark for characterizing the performance of graph-theoretic problems on high-end systems; study data representations for dynamic graph problems on parallel systems; and apply algorithms in SNAP to solve real-world problems in social network analysis and systems biology.
Parallel computing; Graph algorithms; Multithreaded algorithms; Complex networks; Graph analysis framework; Graph theory; Data processing; Network analysis (Planning); Combinatorial analysis
12	Signal Processing on Graphs - Contributions to an Emerging Field / Traitement du signal sur graphes - Contributions à un domaine émergent Girault, Benjamin 01 December 2015 This dissertation introduces in its first part the field of signal processing on graphs. We start by recalling the required elements from linear algebra and spectral graph theory.
Then, we define signal processing on graphs and give intuitions on its strengths and weaknesses compared to classical signal processing. In the second part, we introduce our contributions to the field. Chapter 4 studies structural properties of graphs using classical signal processing through a transformation from graphs to time series. In doing so, we take advantage of a unified method of semi-supervised learning on graphs, dedicated to classification, to obtain a smooth time series. Finally, we show that this method can be recognized as a smoothing operator on graph signals. Chapter 5 introduces a new translation operator on graphs, defined by analogy to the classical time shift operator and satisfying the key property of isometry. Our operator is compared to the two operators of the literature and its action is empirically described on several graphs. Chapter 6 describes the use of the operator above to define stationary graph signals. After giving a spectral characterization of these graph signals, we give a method to study and test stationarity on real graph signals. The closing chapter describes the MATLAB toolbox developed and used during the course of this PhD.
Graph signal processing; Graph translation; Graph analysis; Stochastic graph signals; Stationarity
13	Unraveling the Structure and Assessing the Quality of Protein Interaction Networks with Power Graph Analysis Royer, Loic 12 December 2017 Molecular biology has entered an era of systematic and automated experimentation. High-throughput techniques have moved biology from small-scale experiments focused on specific genes and proteins to genome- and proteome-wide screens. One result of this endeavor is the compilation of complex networks of interacting proteins. Molecular biologists hope to understand life's complex molecular machines by studying these networks. This thesis addresses three open problems centered upon their analysis and quality assessment. First, we introduce power graph analysis as a novel approach to the representation and visualization of biological networks. Power graphs are a graph-theoretic approach to the lossless and compact representation of complex networks. They group edges into cliques and bicliques, and nodes into a neighborhood hierarchy. We demonstrate power graph analysis on five examples, and show its advantages over traditional network representations. Moreover, we evaluate the algorithm's performance on a benchmark, test the robustness of the algorithm to noise, and measure its empirical time complexity at O(e^1.71), which is sub-quadratic in the number of edges e. Second, we tackle the difficult and controversial problem of data quality in protein interaction networks. We propose a novel measure for accuracy and completeness of genome-wide protein interaction networks based on network compressibility.
We validate this new measure by i) verifying the detrimental effect of false positives and false negatives, ii) showing that gold standard networks are highly compressible, iii) showing that authors' choice of confidence thresholds is consistent with high network compressibility, iv) presenting evidence that compressibility is correlated with co-expression, co-localization and shared function, and v) showing that complete and accurate networks of complex systems in other domains exhibit levels of compressibility similar to those of current high-quality interactomes. Third, we apply power graph analysis to networks derived from text mining as well as to gene expression microarray data. In particular, we present i) the network-based analysis of genome-wide expression profiles of the neuroectodermal conversion of mesenchymal stem cells, ii) the analysis of regulatory modules in a rare mitochondrial cytopathy, Mitochondrial Encephalomyopathy, Lactic acidosis, and Stroke-like episodes (MELAS), and iii) an investigation of the biochemical causes behind the enhanced biocompatibility of tantalum compared with titanium.
protein interaction networks; Power graph analysis; proteomics; bioinformatics; computational biology; graph theory; visualization; network compression; Y2H; APMS; miR-124; HIF-1; MELAS; Sjogren syndrome; ddc:570; rvk:WD 5100
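
The abstract above reports an empirical time complexity of about O(e^1.71) in the number of edges e. One common way to arrive at such an exponent is a least-squares fit on log-log axes, sketched below. This is not necessarily the procedure used in the thesis: the function name `fit_exponent`, the edge counts, and the timings are all hypothetical, and the timings are synthetic values generated from an exact power law so that the fit has something to recover.

```python
from math import log

def fit_exponent(sizes, times):
    """Slope of the least-squares line through (log size, log time)."""
    xs = [log(s) for s in sizes]
    ys = [log(t) for t in times]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical benchmark: edge counts and synthetic running times (seconds).
# The timings follow t = c * e**1.71 exactly; real measurements would be noisy.
edge_counts = [1000, 2000, 4000, 8000, 16000]
timings = [3e-6 * e ** 1.71 for e in edge_counts]

exponent = fit_exponent(edge_counts, timings)
print(f"fitted exponent: {exponent:.2f}")
```

An exponent below 2 on such a fit is what "sub-quadratic in the number of edges" refers to.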
14	Compile- and run-time approaches for the selection of efficient data structures for dynamic graph analysis Schiller, Benjamin, Deusser, Clemens, Castrillon, Jeronimo, Strufe, Thorsten 11 January 2017 Graphs are used to model a wide range of systems from different disciplines including social network analysis, biology, and big data processing. When analyzing these constantly changing dynamic graphs at a high frequency, performance is the main concern. Depending on the graph size and structure, update frequency, and read accesses of the analysis, the use of different data structures can yield great performance variations. Even for expert programmers, it is not always obvious which data structure is the best choice for a given scenario. In previous work, we presented an approach for handling the selection of the most efficient data structures automatically using a compile-time approach well suited for constant workloads. We extend this work with a measurement study of seven data structures and use the results to fit actual cost estimation functions. In addition, we evaluate our approach for the computation of seven different graph metrics. In analyses of real-world dynamic graphs with a constant workload, our approach achieves a speedup of up to 5.4× compared to basic data structure configurations. Such a compile-time approach cannot yield optimal results when the behavior of the system changes later and the workload becomes non-constant. To close this gap, we present a run-time approach which provides live profiling and facilitates automatic exchanges of data structures during execution. We analyze the performance of this approach using an artificial, non-constant workload where our approach achieves speedups of up to 7.3× compared to basic configurations.
Dynamic graph analysis; Data structures; Performance; Measurement study; Compile-time optimization; TU Dresden Publishing Fund; ddc:300; rvk:MN 1000
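
The selection idea in the entry above (estimate the cost of each candidate data structure for a given workload, then pick the cheapest) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the structure names, the operation names, and the constant per-operation costs are invented, whereas the paper fits its cost estimation functions to measurements of seven data structures.

```python
# Hypothetical per-operation cost estimates in arbitrary units.
# The paper fits such costs from measurements; these constants are made up.
COSTS = {
    "array":       {"add": 1.0, "remove": 8.0, "contains": 8.0, "iterate": 1.0},
    "hash_set":    {"add": 2.0, "remove": 2.0, "contains": 1.0, "iterate": 3.0},
    "linked_list": {"add": 1.0, "remove": 9.0, "contains": 9.0, "iterate": 2.0},
}

def estimated_cost(structure, workload):
    """Total estimated cost: sum of operation count times per-operation cost."""
    return sum(COSTS[structure][op] * count for op, count in workload.items())

def select_structure(workload):
    """Return the candidate with the lowest estimated cost for this workload."""
    return min(COSTS, key=lambda name: estimated_cost(name, workload))

# Two hypothetical workloads, given as operation counts.
read_heavy = {"add": 10, "remove": 10, "contains": 1000, "iterate": 5}
scan_heavy = {"add": 100, "remove": 0, "contains": 0, "iterate": 1000}

print(select_structure(read_heavy))
print(select_structure(scan_heavy))
```

Under these invented costs the lookup-dominated workload favors the hash set and the iteration-dominated workload favors the array. A run-time variant, as described in the abstract, would recompute the workload counts from live profiling and swap structures when the cheapest candidate changes.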
15	Functional network centrality in obesity: a resting-state and task fMRI study García-García, Isabel, Jurado, María Ángeles, Garolera, Maite, Marqués-Iturria, Idoia, Horstmann, Annette, Segura, Bàrbara, Pueyo, Roser, Sender-Palacios, María José, Vernet-Vernet, Maria, Villringer, Arno, Junqué, Carme, Margulies, Daniel S., Neumann, Jane January 2015 Obesity is associated with structural and functional alterations in brain areas that are often functionally distinct and anatomically distant. This suggests that obesity is associated with differences in functional connectivity of regions distributed across the brain. However, studies addressing whole-brain functional connectivity in obesity remain scarce. Here, we compared voxel-wise degree centrality and eigenvector centrality between participants with obesity (n=20) and normal-weight controls (n=21). We analyzed resting-state and task-related fMRI data acquired from the same individuals. Relative to normal-weight controls, participants with obesity exhibited reduced degree centrality in the right middle frontal gyrus in the resting-state condition. During the task fMRI condition, participants with obesity exhibited lower degree centrality in the left middle frontal gyrus and the lateral occipital cortex, along with reduced eigenvector centrality in the lateral occipital cortex and occipital pole. Our results highlight the central role of the middle frontal gyrus in the pathophysiology of obesity, a structure involved in several brain circuits signaling attention, executive functions and motor functions. Additionally, our analysis suggests the existence of task-dependent reduced centrality in occipital areas, regions that play a role in perceptual processes and are profoundly modulated by attention. info:eu-repo/classification/ddc/610 ddc:610
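
For readers unfamiliar with the two measures compared in the entry above, the sketch below computes degree centrality and eigenvector centrality on a three-node toy graph. It is not the study's voxel-wise fMRI pipeline: the adjacency matrix, function names, and iteration settings are illustrative assumptions, and the power iteration is run on A + I (adjacency plus identity), a standard shift that keeps the same eigenvectors while avoiding oscillation on bipartite graphs.

```python
from math import sqrt

def degree_centrality(adj):
    """Number of neighbours of each node in an unweighted, undirected graph."""
    return [sum(row) for row in adj]

def eigenvector_centrality(adj, iterations=200, tol=1e-10):
    """Leading eigenvector of the adjacency matrix, found by power iteration
    on A + I and normalized to unit Euclidean length."""
    n = len(adj)
    x = [1.0 / sqrt(n)] * n
    for _ in range(iterations):
        # One multiplication by (A + I).
        y = [x[i] + sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = sqrt(sum(v * v for v in y))
        y = [v / norm for v in y]
        if max(abs(a - b) for a, b in zip(x, y)) < tol:
            return y
        x = y
    return x

# Toy graph: the path 0 - 1 - 2, with node 1 in the middle.
adj = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]

degrees = degree_centrality(adj)
scores = eigenvector_centrality(adj)
print(degrees)
print([round(s, 3) for s in scores])
```

Degree centrality counts only direct connections, while eigenvector centrality also weights each connection by how central the neighbour is, which is why the two measures can differ between conditions as reported in the abstract.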
16	Analysis, integration and applications of the human interactome Chaurasia, Gautam 12 December 2012 Protein interaction networks aim to provide the scaffold maps for systematic studies of the complex molecular machinery in the cell. The complexity of protein interactions poses, however, large experimental and computational challenges regarding their identification, validation and annotation. Additionally, storage and linking are demanding since new data are rapidly accumulating. In this research work, I addressed these issues and provided solutions to overcome the limitations of current human protein-protein interaction (PPI) maps. My thesis can be partitioned into two parts. In the first part, I conducted a comparative assessment of eight recently constructed human protein-protein interaction networks to identify experimental biases. Results showed strong selection and detection biases that must be taken into consideration in future applications of these maps. One of the important conclusions of this study was that the current human interaction networks contain complementary information; hence, a database was developed, termed the Unified Human Interactome (UniHI), integrating human PPI data from twelve major sources. Several new tools were included for querying, analyzing and visualizing human PPI networks. In the second part of this research work, the UniHI dataset was applied to characterize the genetic modifiers involved in a specific disease: Chorea Huntington (HD), an autosomal dominant neurodegenerative disease. To find the modifiers, a network-based modeling approach was implemented by integrating a huntingtin-specific protein interaction network with gene expression data from HD patients in multiple steps.
Using this approach, a caudate nucleus-specific HD protein interaction (PPI) network was predicted, connecting 14 potentially dysregulated proteins directly or indirectly to the disease protein and showing a possible link to molecular processes such as pro-apoptotic pathways, cell survival, anti-apoptotic pathways, growth, and neuronal diseases.
Systems Biology; Network Biology; Protein-protein Interaction; Graph Analysis; Huntington Disease; 570 Life sciences, biology; 32 Biology; WD 5100; ddc:570
17	Compile- and run-time approaches for the selection of efficient data structures for dynamic graph analysis Schiller, Benjamin, Deusser, Clemens, Castrillon, Jeronimo, Strufe, Thorsten 11 January 2017 Graphs are used to model a wide range of systems from different disciplines including social network analysis, biology, and big data processing. When analyzing these constantly changing dynamic graphs at a high frequency, performance is the main concern. Depending on the graph size and structure, update frequency, and read accesses of the analysis, the use of different data structures can yield great performance variations. Even for expert programmers, it is not always obvious which data structure is the best choice for a given scenario. In previous work, we presented an approach for handling the selection of the most efficient data structures automatically using a compile-time approach well suited for constant workloads. We extend this work with a measurement study of seven data structures and use the results to fit actual cost estimation functions. In addition, we evaluate our approach for the computation of seven different graph metrics. In analyses of real-world dynamic graphs with a constant workload, our approach achieves a speedup of up to 5.4× compared to basic data structure configurations. Such a compile-time approach cannot yield optimal results when the behavior of the system changes later and the workload becomes non-constant. To close this gap, we present a run-time approach which provides live profiling and facilitates automatic exchanges of data structures during execution. We analyze the performance of this approach using an artificial, non-constant workload where our approach achieves speedups of up to 7.3× compared to basic configurations. info:eu-repo/classification/ddc/300 ddc:300
18	Statistické zhodnocení dat / Statistical data evaluation Fadrný, Tomáš January 2009 This diploma thesis evaluates and processes data from final device checks. All the devices are similar types of thermal overcurrent relays by the ABB company. For appropriate statistical data processing, the Minitab 14 statistical software was used and various statistical methods were applied. Results are always listed for each device type and each method used. The diploma thesis is divided into two parts. The first one analyzes the methods used and the second part states the method results. There is also an overall evaluation of the processed data.
19	Unraveling the Structure and Assessing the Quality of Protein Interaction Networks with Power Graph Analysis Royer, Loic 11 October 2010 Molecular biology has entered an era of systematic and automated experimentation. High-throughput techniques have moved biology from small-scale experiments focused on specific genes and proteins to genome- and proteome-wide screens. One result of this endeavor is the compilation of complex networks of interacting proteins. Molecular biologists hope to understand life's complex molecular machines by studying these networks. This thesis addresses three open problems centered upon their analysis and quality assessment. First, we introduce power graph analysis as a novel approach to the representation and visualization of biological networks. Power graphs are a graph-theoretic approach to the lossless and compact representation of complex networks. They group edges into cliques and bicliques, and nodes into a neighborhood hierarchy. We demonstrate power graph analysis on five examples, and show its advantages over traditional network representations. Moreover, we evaluate the algorithm's performance on a benchmark, test the robustness of the algorithm to noise, and measure its empirical time complexity at O(e^1.71), which is sub-quadratic in the number of edges e. Second, we tackle the difficult and controversial problem of data quality in protein interaction networks. We propose a novel measure for accuracy and completeness of genome-wide protein interaction networks based on network compressibility.
We validate this new measure by i) verifying the detrimental effect of false positives and false negatives, ii) showing that gold standard networks are highly compressible, iii) showing that authors' choice of confidence thresholds is consistent with high network compressibility, iv) presenting evidence that compressibility is correlated with co-expression, co-localization and shared function, and v) showing that complete and accurate networks of complex systems in other domains exhibit levels of compressibility similar to those of current high-quality interactomes. Third, we apply power graph analysis to networks derived from text mining as well as to gene expression microarray data. In particular, we present i) the network-based analysis of genome-wide expression profiles of the neuroectodermal conversion of mesenchymal stem cells, ii) the analysis of regulatory modules in a rare mitochondrial cytopathy, Mitochondrial Encephalomyopathy, Lactic acidosis, and Stroke-like episodes (MELAS), and iii) an investigation of the biochemical causes behind the enhanced biocompatibility of tantalum compared with titanium. info:eu-repo/classification/ddc/570 ddc:570
20	[en] A PSYCHOLINGUISTIC INVESTIGATION OF WRITING IN L1 AND L2: A STUDY WITH ENGLISH TEACHERS / [pt] UMA INVESTIGAÇÃO PSICOLINGUÍSTICA DA ESCRITA EM L1 E L2: UM ESTUDO COM PROFESSORES DE INGLÊS RACHEL DA COSTA MURICY 23 November 2023
[en] This dissertation addresses bilingual writing (Portuguese as L1 and English as L2) from a cognitive perspective, aiming to characterize both the writing process and the final product in an integrated manner and to explore correlations between writing performance and attentional aspects. The research involves 15 English language teachers (10 women and 5 men) with an average age of 43.5 years (SD 13.25), all native speakers of Brazilian Portuguese. The study used computational tools to record writing actions during the production of argumentative texts (Inputlog program), to automatically analyze linguistic aspects of the final text (Nilc-Metrix program for Portuguese and Coh-Metrix for English), and to verify connectivity patterns in the final text using graph attributes (SpeechGraphs program). The Attention Network Test (ANT) was also adopted. In the analysis of the writing process, patterns of pauses, active writing operations, and revision actions (insertions and deletions) were examined. In the product analysis, parameters related to vocabulary, semantics, syntax, and readability indices, as well as information on lexical recurrence and word connectivity, were considered. Regarding the writing process, the results of the study revealed differences between the two languages, with higher values associated with writing in English, particularly in terms of (i) pauses within words, possibly indicating orthographic demands, and (ii) the percentage of uninterrupted writing, suggesting less interruption and fewer alterations/revisions. Correlation analysis indicated that participants exhibited a similar writing profile in both L1 and L2. In the product analysis using Coh-Metrix (English) and Nilc-Metrix (Portuguese), it was found, through readability indices, that the texts exhibited moderate complexity in both languages.
Despite differences in how metrics are defined in each program, the results suggest that texts in Portuguese show a higher level of complexity when considering syntactic aspects (such as the number of words before main verbs) and semantic aspects (concreteness degree). For L2, lexical diversity remains one of the most reliable proficiency indicators, correlating with pause behavior (before words) and revision (normal production). Regarding SpeechGraphs, significant differences were observed between texts in L1 and L2 for almost all analyzed graph attributes, reflecting how the program deals with morphological characteristics of the two languages. No correlations were observed between the behavior of speakers in L1 and L2. Additionally, correlation studies were conducted between Inputlog data and the Coh-Metrix and Nilc-Metrix tools, as well as between these tools and SpeechGraphs data. In both studies, a correspondence was observed between parameters indicative of complexity in the tools used, suggesting a relevant path for exploring integrated process-product analysis in future research. Regarding the correlation study between Inputlog and ANT data, notable correlations emerged between accuracy and reaction time in experimental conditions and percentages of deletions. These findings pave the way for significant contributions to the field of psycholinguistics in the context of research between L1 and L2.
WRITING; COH-METRIX; NILC-METRIX; GRAPH ANALYSIS; KEYSTROKE LOGGING; WRITING PROCESS; WRITING IN L1 AND L2; ATTENTION
