1 |
An analysis of emotion-exchange motifs in multiplex networks during emergency events
Kusen, Ema; Strembeck, Mark. January 2019
In this paper, we present an analysis of the emotion-exchange patterns that arise from
Twitter messages sent during emergency events. To this end, we performed a
systematic structural analysis of the multiplex communication network that we derived
from a data set of more than 1.9 million tweets that were sent during five
recent shootings and terror events. In order to study the local communication
structures that emerge as Twitter users directly exchange emotional messages, we
propose the concept of emotion-exchange motifs. Our findings suggest that
emotion-exchange motifs which contain reciprocal edges (indicating online
conversations) only emerge when users exchange messages that convey anger or fear,
either in isolation or in any combination with another emotion. In contrast, the
expression of sadness, disgust, surprise, as well as any positive emotion, is rather
characteristic of emotion-exchange motifs representing one-way communication
patterns (instead of online conversations). Among other things, we also found that a
higher structural similarity exists between pairs of network layers consisting of one
high-arousal emotion and one low-arousal emotion, rather than pairs of network layers
belonging to the same arousal dimension.
2 |
Innovative Algorithms and Evaluation Methods for Biological Motif Finding
Kim, Wooyoung. 05 May 2012
Biological motifs are defined as over-represented, recurring sub-patterns in biological systems. Sequence motifs and network motifs are examples of biological motifs. Because of their wide range of applications, many algorithms and computational tools have been developed to search for biological motifs efficiently. As a result, there are more computationally derived motifs than experimentally validated ones, and how to validate the biological significance of these "candidate motifs" becomes an important question. Some sequence motifs are verified by their structural similarities or their functional roles in DNA or protein sequences and are stored in databases. However, the biological role of network motifs has not yet been validated, and no databases currently exist for this purpose.
In this thesis, we focus not only on the computational efficiency but also on the biological meaning of the motifs. We provide an efficient way to incorporate biological information into clustering analysis methods. For example, a sparse nonnegative matrix factorization (SNMF) method is used with Chou-Fasman parameters for protein motif finding, and biological network motifs are searched for by various clustering algorithms using Gene Ontology (GO) information. Experimental results show that these algorithms perform better than existing algorithms by producing a larger number of high-quality biological motifs.
In addition, we apply biological network motifs to the discovery of essential proteins. Essential proteins are defined as the minimum set of proteins that are vital for an organism's cellular life and for its development into a fertile adult. We design a new centrality algorithm based on biological network motifs, named MCGO, and use it to score proteins in a protein-protein interaction (PPI) network in order to find essential proteins. MCGO is also combined with other centrality measures to predict essential proteins using machine learning techniques.
This thesis makes three contributions to the study of biological motifs: 1) clustering analysis is used efficiently, and biological information is easily integrated with the analysis; 2) we focus more on the biological meaning of motifs by adding biological knowledge to the algorithms and by suggesting biologically related evaluation methods; 3) biological network motifs are successfully applied to a practical problem, the prediction of essential proteins.
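The idea of scoring proteins by the motifs they take part in can be sketched in a few lines. The example below counts, for each node of an undirected PPI network, how many triangles (the simplest three-node motif) it belongs to. This is an assumption-laden illustration and not MCGO itself: the abstract does not specify how MCGO combines motifs with GO information, and the function name and protein identifiers here are hypothetical.

```python
from itertools import combinations


def triangle_counts(edges):
    """Count, for every node, the triangles it participates in.

    `edges` is an iterable of undirected (u, v) pairs. Triangle membership
    stands in for a motif-based centrality score; it is NOT the MCGO measure.
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # skip self-interactions
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    counts = {node: 0 for node in adj}
    for node, neighbours in adj.items():
        # Two neighbours that are themselves connected close a triangle.
        for a, b in combinations(sorted(neighbours), 2):
            if b in adj[a]:
                counts[node] += 1
    return counts


# Hypothetical interactions, invented for illustration.
toy_ppi = [("p1", "p2"), ("p2", "p3"), ("p1", "p3"), ("p3", "p4")]
scores = triangle_counts(toy_ppi)
```

A score of this kind could then be used alongside other centrality measures as one feature among several, which is the general shape of the machine learning step the abstract mentions.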
3 |
A Mixture-of-Experts Approach for Gene Regulatory Network Inference
Shao, Borong. January 2014
Context. Gene regulatory network (GRN) inference is an important and challenging problem in bioinformatics. A variety of machine learning algorithms have been applied to increase GRN inference accuracy, and ensemble learning methods have been shown to yield higher inference accuracy than individual algorithms. Objectives. We propose an ensemble GRN inference method based on the principle of Mixture-of-Experts ensemble learning. The proposed method can quantitatively measure the accuracy of individual GRN inference algorithms at the level of network motifs. Based on how accurately each algorithm predicts different types of network motifs, weights are assigned to the individual algorithms so as to exploit their strengths and compensate for their weaknesses. In this way, we can improve the accuracy of the ensemble prediction. Methods. The research methodology is a controlled experiment. The independent variable is the method, which has eight groups: five individual algorithms, the generic average ranking method used in the DREAM5 challenge, and the proposed ensemble method in two variants, one using four types of network motifs and one using five. The dependent variable is GRN inference accuracy, measured by the area under the precision-recall curve (AUPR). The experiment has a training phase and a testing phase. In the training phase, we analyze the accuracy of the five individual algorithms at the network motif level to decide their weights. In the testing phase, the weights are used to combine the predictions of the five individual algorithms into ensemble predictions. We compare the accuracy of the eight method groups on an Escherichia coli microarray dataset using AUPR. Results. In the training phase, we obtain the AUPR values of the five individual algorithms for predicting each type of network motif. In the testing phase, we collect the AUPR values of the eight methods for predicting the GRN of the Escherichia coli microarray dataset. Each method group has a sample size of ten (ten AUPR values). Conclusions. Statistical tests on the experimental results show that the proposed method yields significantly higher accuracy than the generic average ranking method. In addition, a new type of network motif is found in the GRN, and including it significantly increases the accuracy of the proposed method.
Genes are DNA segments that control the biological traits and biochemical processes that comprise life. They interact with each other to achieve the precise regulation of life activities. Biologists aim to understand the regulatory network among genes with the help of high-throughput technologies such as microarrays and RNA-seq. These technologies produce large amounts of gene expression data that contain useful information, so effective data mining is necessary to extract this information and advance biological research. Gene regulatory network (GRN) inference aims to infer gene interactions from gene expression data, such as microarray datasets. The inference results can be used to guide further experiments that discover or validate gene interactions. A variety of machine learning (data mining) methods have been proposed to solve this problem. In recent years, experiments have shown that ensemble learning methods achieve higher accuracy than individual learning methods, because they can take advantage of the strengths of different individual methods and are robust to different network structures. In this thesis, we propose an ensemble GRN inference method based on the principle of Mixture-of-Experts ensemble learning. By quantitatively measuring the accuracy of individual methods at the network motif level, the proposed method is able to take advantage of the complementarity among the individual methods. The proposed method yields significantly higher accuracy than the generic average ranking method, which is the most accurate of the 35 GRN inference methods in the DREAM5 challenge.
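The two quantitative ingredients named in this abstract, a weighted combination of several algorithms' predictions and an AUPR-style accuracy score, can be sketched as follows. This is a simplified illustration under stated assumptions, not the thesis implementation: it uses a single weight per algorithm instead of weights per motif type, it estimates AUPR by average precision, and all names, scores, and weights are hypothetical.

```python
def average_precision(scores, labels):
    """Average precision, a common step-wise estimate of AUPR.

    `scores` are predicted confidences and `labels` are 1 (true edge) or 0.
    """
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    total_pos = sum(labels)
    if total_pos == 0:
        return 0.0
    hits, ap = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            ap += hits / rank  # precision at each recovered true edge
    return ap / total_pos


def weighted_ensemble(predictions, weights):
    """Combine per-algorithm edge scores using normalized weights.

    `predictions` maps algorithm name -> {edge: score}; `weights` maps
    algorithm name -> non-negative weight (assumed here to come from a
    training phase; the thesis derives them per motif type).
    """
    total = sum(weights.values())
    combined = {}
    for algo, edge_scores in predictions.items():
        for edge, score in edge_scores.items():
            combined[edge] = combined.get(edge, 0.0) + weights[algo] * score / total
    return combined


# Hypothetical scores and weights, invented for illustration.
predictions = {
    "algo_A": {("g1", "g2"): 0.9, ("g1", "g3"): 0.2},
    "algo_B": {("g1", "g2"): 0.4, ("g1", "g3"): 0.8},
}
weights = {"algo_A": 3.0, "algo_B": 1.0}
combined = weighted_ensemble(predictions, weights)
ap = average_precision([0.9, 0.8, 0.3], [1, 0, 1])
```

With the higher weight on `algo_A`, the combined ranking follows that algorithm's ordering of the two edges; a per-motif weighting would apply the same arithmetic separately to the edges belonging to each motif type.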
4 |
Systematic prediction of feedback regulatory network motifs
Sahoo, Amruta. 04 1900
Understanding the intricate wiring of cellular regulation remains a most formidable challenge. Fundamental insights into the wiring and functioning of the protein homeostasis network will help to better understand how protein homeostasis fails in disease and how the regulatory patterns of the protein homeostasis network can be targeted for therapeutic intervention. The study aims to develop and apply a novel computational methodology for the systematic identification and characterization of feedback systems in protein homeostasis. The proposed research combines ideas and approaches from protein science, yeast systems biology, computational biology, and network biology. The difficulty of incorporating multi-platform, multi-omics data is amplified by the large network of interconnected genes, proteins, and metabolites that come together to perform a specific function. For my master's thesis, I developed a path-based pattern finding (PBPF) algorithm, which searches for and enumerates network motifs of a required topology. It is a graph-theory-based algorithm that combines depth-first traversal with breadth-first search to identify the required network sub-graph topologies. Further, the functioning of the algorithm has been demonstrated in the realm of protein homeostasis in Saccharomyces cerevisiae. A systematic approach to integrating systems biology data was developed, which demonstrates the systematic identification of known regulatory feedback motifs in protein homeostasis. The algorithm is also claimed to be able to identify novel, pervasive regulatory feedback motifs. Its application can be extended to other biological systems, for example to identify cell-state-specific feedback motifs in stem cells.
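As a rough illustration of path-based enumeration of feedback motifs, the sketch below lists the simple directed cycles (feedback loops) of a regulatory graph up to a chosen length using depth-first traversal. It is not the PBPF algorithm: the abstract describes PBPF only as combining depth-first and breadth-first search, so the depth-first-only strategy, the function name, and the toy regulator names are all assumptions made for this example.

```python
def feedback_loops(graph, max_len):
    """Enumerate simple directed cycles containing at most `max_len` nodes.

    `graph` maps each node to an iterable of its successors. Each cycle is
    reported once, starting from its smallest node (nodes must be orderable).
    """
    loops = []

    def dfs(start, node, path):
        for nxt in graph.get(node, ()):
            if nxt == start:
                loops.append(list(path))  # path closes back on its start
            elif nxt > start and nxt not in path and len(path) < max_len:
                # Only visit nodes larger than `start` so that every cycle
                # is discovered from its smallest member exactly once.
                dfs(start, nxt, path + [nxt])

    for start in sorted(graph):
        dfs(start, start, [start])
    return loops


# Hypothetical regulatory edges, invented for illustration.
toy_network = {"A": ["B"], "B": ["C"], "C": ["A", "B"]}
loops = feedback_loops(toy_network, max_len=3)
```

Restricting the search to a maximum length keeps the enumeration tractable on large networks, since the number of simple cycles can grow very quickly with network size.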