Global ETD Search

1	Algorithmes pour la prédiction in silico d'interactions par similarité entre macromolécules biologiques / Similarity-based algorithms for the prediction of interactions between biomolecules
Voland, Mathieu 03 April 2017
The action of a drug, or of any other small biomolecule, results from chemical interactions with macromolecules such as the proteins that regulate cell functions. Determining the set of targets, that is, the macromolecules that can bind a given small molecule, is essential to understanding the molecular mechanisms responsible for the effects of a drug. This knowledge could guide drug design so as to avoid undesirable side effects or, conversely, to find new applications for known molecules. Advances in structural biology now give access to the three-dimensional structures of a very large number of proteins involved in these interactions, which motivates the use of in silico (computational) tools to complement or guide in vitro or in vivo experiments, which are slower and more expensive.
This research was conducted as part of a collaboration between the DAVID laboratory of the Université de Versailles-Saint-Quentin and Bionext SA, a company that offers a software suite to visualize and analyze chemical interactions between biological molecules. The objective is to design an algorithm that, from the structural data of proteins, determines potential targets for a given compound. The chosen approach starts from a known interaction between a compound and a protein and searches by similarity for other proteins that can be inferred to bind the same compound. More precisely, it searches for local similarity between a given pattern, the region that allows the known target to bind the compound, and a set of candidate proteins.
An algorithm, BioBind, was developed. It relies on a model of macromolecular surfaces derived from alpha shape theory, which represents the accessible surface together with a topology on that surface, so that surface regions of any shape can be defined. To search for a given pattern on the surface, a heuristic is used that defines regular regions approximating geodesic disks. Their circular shape allows exhaustive sampling of the macromolecular surface and fast comparison, and each circular region can then be extended to the full pattern to provide a similarity measure with the query binding site.
The target prediction problem is formalized as a binary classification problem: given a set of proteins, determine which are likely to interact with the compound, based on their local similarity with the known target. With this formulation, classic metrics can be used to assess the performance of our approach and to compare it with others. Three datasets were used: two taken from the literature, and a third designed specifically for this problem to be more representative of pharmacologically relevant molecules, that is, molecules with drug-like properties. On these three datasets our algorithm compares favorably with another state-of-the-art similarity-based approach, and more generally our analysis confirms that, according to our metric, docking approaches perform worse than similarity-based approaches for target prediction when a first target is known.
Keywords: Alpha shapes; Macromolecular surface; Geometric algorithm; Macromolecular similarity; Interaction prediction; Structural biology; 005.1
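The binary-classification framing described in this abstract can be illustrated with a minimal Python sketch. It is not BioBind and does not reproduce the thesis's metric: the similarity scores, labels, and threshold below are hypothetical values, and ROC AUC, precision, and recall are used only as examples of the "classic metrics" the abstract mentions.

```python
def roc_auc(scores, labels):
    """Probability that a random true target outranks a random non-target.

    Ties between a target and a non-target count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one target and one non-target")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision_recall(scores, labels, threshold):
    """Classify a protein as a predicted target when its score >= threshold."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y)
    fp = sum(1 for p, y in zip(predicted, labels) if p and not y)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical local-similarity scores of six candidate proteins against the
# known binding site, with 1 marking proteins known to bind the compound.
scores = [0.91, 0.78, 0.65, 0.40, 0.32, 0.10]
labels = [1, 1, 0, 1, 0, 0]

auc = roc_auc(scores, labels)
precision, recall = precision_recall(scores, labels, threshold=0.5)
```

A threshold-free score such as the AUC makes it possible to compare two scoring methods on the same dataset without first agreeing on a similarity cut-off.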
2	Structural Similarity: Applications to Object Recognition and Clustering
Curado, Manuel 03 September 2018
In this thesis, we propose many developments in the context of Structural Similarity. We address both node (local) similarity and graph (global) similarity. Concerning node similarity, we focus on improving the diffusive process used to compute this similarity (e.g. Commute Times) by modifying or rewiring the structure of the graph (Graph Densification), although some advances in Laplacian-based ranking are also included in this document. Graph Densification is a particular case of what we call graph rewiring, i.e. a novel field (similar to image processing) in which input graphs are rewired to be better conditioned for subsequent pattern recognition tasks (e.g. clustering). We contribute a scalable and effective method driven by Dirichlet processes, and we propose both a completely unsupervised and a semi-supervised approach to Dirichlet densification. We also contribute new random walkers (Return Random Walks), which are useful as structural filters and as asymmetry detectors in directed brain networks used to make early predictions of Alzheimer's disease (AD).
Graph similarity is addressed by designing structural information channels as a means of measuring the Mutual Information between graphs. To this end, we first embed the graphs by means of Commute Times. Commute Times embeddings have good properties for Delaunay triangulations (the typical representation for Graph Matching in computer vision). This means that these embeddings can act as encoders in the channel as well as decoders (since they are invertible). Consequently, structural noise can be modelled by the deformation introduced in one of the manifolds to fit the other one. This methodology leads to a highly discriminative similarity measure, since the Mutual Information is measured on the manifolds (vectorial domain) through copulas and bypass entropy estimators. This is consistent with the methodology of decoupling the measurement of graph similarity into two steps: a) linearizing the Quadratic Assignment Problem (QAP) by means of the embedding trick, and b) measuring similarity in vector spaces.
The QAP is also investigated in this thesis. More precisely, we analyze the behaviour of $m$-best Graph Matching methods. These methods usually start from a couple of best solutions and then locally expand the search space by excluding previously clamped variables. The next variable to clamp is usually selected randomly, but we show that this reduces performance when structural noise (outliers) arises. Alternatively, we propose several heuristics for spanning the search space and evaluate all of them, showing that they are usually better than random selection. These heuristics are particularly interesting because they exploit the structure of the affinity matrix. Efficiency is improved as well.
Concerning the application domains explored in this thesis, we focus on object recognition (graph similarity), clustering (rewiring), compression/decompression of graphs (links with Extremal Graph Theory), 3D shape simplification (sparsification) and early prediction of AD.
Ministerio de Economía, Industria y Competitividad (Referencia TIN2012-32839 BES-2013-064482)
Keywords: Graph densification; Cut similarity; Spectral clustering; Dirichlet problems; Random walkers; Commute Times; Graph algorithms; Regular Partition; Szemeredi; Alzheimer's disease; Graphs; Return Random Walk; Net4lap; Directed graphs; Spectral graph theory; Graph entropy; Mutual information; Manifold alignment; m-Best Graph Matching; Binary-Tree Partitions; QAP; Graph sparsification; Shape simplification; Alpha shapes
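The Commute Times embedding mentioned in this abstract can be sketched with the standard textbook construction from the graph Laplacian. This is a minimal illustration under stated assumptions, not the thesis's implementation: it assumes an undirected, connected graph given as a symmetric adjacency matrix, uses NumPy, and the function names are hypothetical.

```python
import numpy as np


def commute_times(adjacency):
    """Pairwise commute times of a random walk on an undirected, connected graph.

    Uses CT(i, j) = vol(G) * (L+_ii + L+_jj - 2 * L+_ij), where L+ is the
    Moore-Penrose pseudoinverse of the Laplacian and vol(G) the sum of degrees.
    """
    A = np.asarray(adjacency, dtype=float)
    degrees = A.sum(axis=1)
    laplacian = np.diag(degrees) - A
    l_pinv = np.linalg.pinv(laplacian)
    diag = np.diag(l_pinv)
    return degrees.sum() * (diag[:, None] + diag[None, :] - 2.0 * l_pinv)


def commute_time_embedding(adjacency, tol=1e-10):
    """Node coordinates whose squared Euclidean distances equal commute times.

    Row i holds sqrt(vol) * phi_k(i) / sqrt(lambda_k) over the nonzero
    Laplacian eigenpairs (lambda_k, phi_k).
    """
    A = np.asarray(adjacency, dtype=float)
    degrees = A.sum(axis=1)
    laplacian = np.diag(degrees) - A
    eigvals, eigvecs = np.linalg.eigh(laplacian)
    keep = eigvals > tol  # drop the zero eigenvalue of a connected graph
    return np.sqrt(degrees.sum()) * eigvecs[:, keep] / np.sqrt(eigvals[keep])


# Path graph 0 - 1 - 2, used here only as a small check case.
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]

ct = commute_times(path)
coords = commute_time_embedding(path)
squared_dists = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=-1)
```

Because the embedding turns a diffusion quantity into Euclidean geometry, two embedded graphs can be compared in a vector space, which matches the two-step decoupling described in the abstract. For a disconnected graph the commute time between components is infinite, so this sketch does not apply there.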
