  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

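Modélisation de réseaux d'interactions des microARN et analyse et validation expérimentale de leurs boucles minimales avec des facteurs de transcription / Modeling of microRNA interaction networks and analysis and experimental validation of their minimal loops with transcription factors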

Lisi, Véronique 12 1900 (has links)
MicroRNAs (miRNAs) are small non-coding RNAs that repress the translation of their target genes by pairing with the corresponding messenger RNA (mRNA). Identifying the biologically active targets of miRNAs is crucial to understanding their roles, but it is a difficult problem because their binding sites are defined by only seven nucleotides. In this thesis, I show that modeling specific aspects of miRNAs makes it possible to identify their biologically active targets, using two models that each address one aspect of miRNAs. The first model considers the regulation of miRNAs through the identification of regulatory loops between miRNAs and transcription factors (TFs). It identified more than 700 miRNA/TF regulatory loops conserved between human and mouse, including, in particular, two autoregulatory loops between LMO2 and the miRNAs miR-223 and miR-363. Transplantation experiments with hematopoietic stem cells and hematopoietic progenitor cells then showed that miR-223 and miR-363 are involved in hematopoietic cell fate determination. The second model focuses directly on the interactions between miRNAs and mRNAs in order to determine miRNA targets. This work produced a simple method for predicting miRNA targets that outperforms current tools. The model also highlighted some unsuspected consequences of miRNA effects, such as the specificity of miRNA targets to the cellular context and the saturation of some mRNAs by miRNAs. The method can also be used to identify mRNAs whose overexpression increases the level of another mRNA through shared miRNAs while having minimal effects on non-targeted mRNAs.
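The abstract's point that a binding site is defined by only seven nucleotides can be illustrated with a minimal sketch of perfect seed matching. This is not the prediction method developed in the thesis; it assumes the common convention that the seed spans miRNA positions 2 to 8 and that only Watson-Crick pairs count. The function names and the toy sequences are hypothetical.

```python
def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string (A-U and G-C pairs only)."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(rna))


def find_seed_sites(mirna: str, transcript: str) -> list[int]:
    """Return 0-based start positions in `transcript` that pair perfectly
    with the 7-nucleotide seed (assumed here to be miRNA positions 2-8)."""
    seed = mirna[1:8]                  # positions 2..8, counted from the 5' end
    site = reverse_complement(seed)    # the motif the transcript must contain
    return [
        start
        for start in range(len(transcript) - len(site) + 1)
        if transcript[start:start + len(site)] == site
    ]


# Toy sequences, chosen only for illustration.
toy_mirna = "UGAGGUAGUAGGUUGUAUAGUU"
toy_transcript = "AAACUACCUCAGGGCUACCUCUU"
print(find_seed_sites(toy_mirna, toy_transcript))
```

Because a 7-mer occurs by chance roughly once every 4^7 (about 16,000) positions, a scan like this returns many sites that are not biologically active, which is the difficulty the abstract describes.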
12

Network Analysis and Comparative Phylogenomics of MicroRNAs and their Respective Messenger RNA Targets Using Twelve Drosophila species

Woodcock, M Ryan 17 November 2010 (has links)
MicroRNAs are a special class of small (~21–25 nucleotide) non-coding RNA molecules that exert powerful post-transcriptional control over gene expression in eukaryotes. Indeed, microRNAs likely represent the most abundant class of regulators in animal gene regulatory networks. This study describes the recovery and network analysis of a suite of homologous microRNA targets, recovered through two different prediction methods for whole gene regions across twelve Drosophila species. Phylogenetic criteria under an accepted tree topology were used as a reference frame to 1) draw inferences about microRNA-target predictions, 2) study mathematical properties of microRNA-gene regulatory networks, and 3) conduct novel phylogenetic analyses using character data derived from weighted edges of the microRNA-target networks. The study investigates the evidence of natural selection and the phylogenetic signatures inherent in microRNA regulatory networks, and quantifies the time and mutation necessary to rewire a microRNA regulatory network. Selective factors that appear to operate upon seed aptamers include cooperativity (redundancy) of interactions and transcript length. Topological analyses of microRNA regulatory networks recovered significant enrichment for a motif possessing a redundant link in all twelve species sampled. This suggests that the topology of the whole interactome has itself historically been subject to natural selection, with resilience to attack offering a selective advantage. Only a modest number of microRNA–mRNA interactions appear to be conserved over Drosophila cladogenesis. The decrease in conserved microRNA-target interactions with increasing phylogenetic distance followed a curve typical of a saturation phenomenon. Scale-free properties of a network intersection of microRNA target prediction methods were found to transect taxonomic hierarchy.
13

Développement de méthodes bioinformatiques dédiées à la prédiction et l'analyse des réseaux métaboliques et des ARN non codants / Development of bioinformatic methods dedicated to the prediction and the analysis of metabolic networks and non-coding RNA

Ghozlane, Amine 20 November 2012 (has links)
Identifying the interactions that occur at the molecular level is crucial to understanding living systems. The aim of this work was to develop methods to model and predict these interactions for metabolism and the regulation of transcription, modeling these systems as graphs and automata. First, we developed a method to test and predict the flux distribution within a metabolic network that allows one or more constraints to be formulated. We show that taking biological data into account with this method better reproduces certain phenotypes observed in vivo for our study model, the energy metabolism of the parasite Trypanosoma brucei. The results offered elements of explanation for the flexibility of the metabolic flux that were consistent with the experimental data. Second, we considered a particular class of non-coding RNAs called sRNAs, which are involved in regulating the cellular response to environmental changes. We developed an approach to better predict their interactions with other RNAs, based on interaction prediction, an enrichment analysis of the biological context of the targets, and a visualization system specially adapted to handling these data. We applied the method to the study of the sRNAs of the bacterium Escherichia coli. The predictions agreed with the available experimental data and suggested several new candidate targets.
14

Machine Learning Methods for Using Network-Based Information in MicroRNA Target Prediction

Sualp, Merter 01 February 2013 (has links) (PDF)
Computational microRNA (miRNA) target identification in animal genomes is a challenging problem because of the imperfect pairing of the miRNA with the target site. Techniques based on sequence alone are prone to produce many false positive interactions. Integrative techniques have therefore been developed that use additional genomic and structural features and evolutionary conservation information to reduce the high false positive rate. We propose that the context of a putative miRNA target in a protein-protein interaction (PPI) network can be used as an additional filter in a computational miRNA target prediction algorithm. We compute several graph-theoretic measures on the human PPI network as indicators of network context. We assess the performance of individual and combined contextual measures in increasing the precision of a popular miRNA target prediction tool, TargetScan, using low-throughput and high-throughput datasets of experimentally verified human miRNA targets. We used classification algorithms for this assessment. Because only miRNA targets are available as training samples, this becomes a One-Class Classification (OCC) problem. We devised a novel OCC method, DiVo, based on simple distance metrics and voting. Comparative analysis with state-of-the-art methods shows that DiVo attains better classification performance. Our results indicate that topological properties of target gene products in PPI networks are valuable sources of information for filtering out false positive miRNA target genes. We show that, for the targets of a number of miRNAs, network context correlates better with being a target than the sequence-based score provided by the prediction tool.
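The abstract does not describe how DiVo works beyond "simple distance metrics and voting", so the following is only a generic sketch of a one-class classifier built from those two ingredients, not a reconstruction of DiVo. It assumes each candidate target is a numeric feature vector (for example, hypothetical network measures such as degree or centrality), learns a centroid from positive examples only, and accepts a sample when most distance metrics place it within the training radius. All names are illustrative.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))


METRICS = (euclidean, manhattan, chebyshev)


def fit(positives):
    """Learn from positive examples only: a centroid and, for each metric,
    the largest distance from any training example to that centroid."""
    dim = len(positives[0])
    centroid = [sum(p[i] for p in positives) / len(positives) for i in range(dim)]
    radii = [max(metric(p, centroid) for p in positives) for metric in METRICS]
    return centroid, radii


def predict(model, sample):
    """Each metric votes 'target' if the sample lies within its training
    radius; the sample is accepted when a strict majority votes yes."""
    centroid, radii = model
    votes = sum(metric(sample, centroid) <= radius
                for metric, radius in zip(METRICS, radii))
    return votes * 2 > len(METRICS)


# Hypothetical two-feature vectors for known targets.
known_targets = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1)]
model = fit(known_targets)
print(predict(model, (1.05, 1.0)), predict(model, (5.0, 5.0)))
```

Using the maximum training distance as the radius is the simplest possible threshold; a quantile of the training distances would be less sensitive to outliers among the positives.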
17

SIMTar: uma ferramenta para predição de SNPs interferindo em sítios alvos de microRNAs / SIMTar: a tool for prediction of SNPs interfering on microRNA target sites

Piovezani, Amanda Rusiska 12 September 2013 (has links)
Single nucleotide polymorphisms (SNPs) can alter not only the reading-frame codons of genes but also genomic positions that are important for gene regulation, such as transcription initiation sites, splicing sites and, more specifically, microRNA (miRNA) target sites. Identifying SNPs that alter miRNA target sites is still an open problem, although it has gained prominence in recent years owing to discoveries about the role of miRNAs as regulatory elements of the genome and their association with many diseases, such as cancer and several psychiatric disorders. The computational resources currently available for this purpose (four databases and one tool) are restricted to the analysis of SNPs in the 3′ untranslated region (3′ UTR) of mRNAs, where miRNAs typically bind to repress translation. This is a simplification of the problem, however: miRNAs are already known to activate or repress gene transcription when bound to the promoter region, to increase the effectiveness of negative regulation of translation when bound to the coding region of the gene, and to bind to non-coding RNAs. These resources are also limited to identifying SNPs in the seed region of miRNA target sites, and can therefore only identify the creation or disruption of sites. SNPs located outside this region, however, can not only contribute to the creation and disruption of target sites but also affect the stability of miRNA binding and therefore the effectiveness of regulation. Moreover, across the full length of the target site, not only the seed, more than one SNP can occur, and the combination of these SNPs can have an even greater influence on miRNA binding. Current resources also do not report which SNP alleles, let alone which combinations of alleles, cause which effect. Finally, these resources are restricted to Homo sapiens and Mus musculus. This work presents the computational tool SIMTar (SNPs Interfering on MicroRNA Targets), developed to identify SNPs that alter miRNA target sites and to fill the gaps mentioned above. It also describes an application of SIMTar to the analysis of 114 SNPs associated with schizophrenia, all of which were predicted to interfere with miRNA target sites.
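The seed-only check that the abstract attributes to existing resources (site creation or disruption) can be sketched as follows. This is not SIMTar, which by the abstract's account also handles SNPs outside the seed, allele combinations and binding stability. The sketch assumes a 7-nucleotide seed at miRNA positions 2 to 8, perfect Watson-Crick pairing, and a single-base substitution at a 0-based position; the function names and sequences are hypothetical.

```python
def seed_site(mirna: str) -> str:
    """Return the transcript motif that pairs with the miRNA seed
    (assumed here to be miRNA positions 2-8)."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(mirna[1:8]))


def snp_effect(mirna: str, transcript: str, pos: int, alt: str) -> str:
    """Classify a substitution at 0-based `pos` as 'created', 'disrupted'
    or 'unchanged' with respect to perfect seed sites overlapping `pos`."""
    site = seed_site(mirna)
    mutated = transcript[:pos] + alt + transcript[pos + 1:]
    # Only windows that overlap the SNP position can change.
    lo = max(0, pos - len(site) + 1)
    hi = pos + len(site)
    before = site in transcript[lo:hi]
    after = site in mutated[lo:hi]
    if before and not after:
        return "disrupted"
    if after and not before:
        return "created"
    return "unchanged"


# Toy sequences, chosen only for illustration.
toy_mirna = "UGAGGUAGUAGGUUGUAUAGUU"
print(snp_effect(toy_mirna, "AAACUACCUCAGG", 5, "G"))
print(snp_effect(toy_mirna, "AAACUGCCUCAGG", 5, "A"))
```

A check like this is blind to exactly the cases the abstract highlights: a SNP a few bases outside the seed window is always reported as "unchanged", even if it weakens binding.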
18

Probabilistic Models for Collecting, Analyzing, and Modeling Expression Data

Le, Hai-Son Phuoc 01 May 2013 (has links)
Advances in genomics allow researchers to measure the complete set of transcripts in cells. These transcripts include messenger RNAs (which encode proteins) and microRNAs, short RNAs that play an important regulatory role in cellular networks. While this data is a great resource for reconstructing the activity of networks in cells, it also presents several computational challenges. These challenges include the data collection stage, which often results in incomplete and noisy measurements; developing methods to integrate several experiments within and across species; and designing methods that can use this data to map the interactions and networks that are activated in specific conditions. Novel and efficient algorithms are required to address these challenges. In this thesis, we present probabilistic models to address the set of challenges associated with expression data. First, we present a novel probabilistic error correction method for RNA-Seq reads. RNA-Seq generates large and comprehensive datasets that have revolutionized our ability to accurately recover the set of transcripts in cells. However, sequencing reads inevitably contain errors, which affect all downstream analyses. To address these problems, we develop an efficient hidden Markov model-based error correction method for RNA-Seq data. Second, for the analysis of expression data across species, we develop clustering and distance function learning methods for querying large expression databases. The methods use a Dirichlet Process Mixture Model with latent matchings and infer soft assignments between genes in two species to allow comparison and clustering across species. Third, we introduce new probabilistic models to integrate expression and interaction data in order to predict targets and networks regulated by microRNAs. Combined, the methods developed in this thesis provide a solution to the pipeline of expression analysis used by experimentalists when performing expression experiments.
20

Improved in silico methods for target deconvolution in phenotypic screens

Mervin, Lewis January 2018 (has links)
Target-based screening projects for bioactive (orphan) compounds have been shown in many cases to be insufficiently predictive for in vivo efficacy, leading to attrition in clinical trials. Phenotypic screening has hence undergone a renaissance in both academia and in the pharmaceutical industry, partly due to this reason. One key shortcoming of this paradigm shift is that the protein targets modulated need to be elucidated subsequently, which is often a costly and time-consuming procedure. In this work, we have explored both improved methods and real-world case studies of how computational methods can help in target elucidation of phenotypic screens. One limitation of previous methods has been the ability to assess the applicability domain of the models, that is, when the assumptions made by a model are fulfilled and which input chemicals are reliably appropriate for the models. Hence, a major focus of this work was to explore methods for calibration of machine learning algorithms using Platt Scaling, Isotonic Regression Scaling and Venn-Abers Predictors, since the probabilities from well calibrated classifiers can be interpreted at a confidence level and predictions specified at an acceptable error rate. Additionally, many current protocols only offer probabilities for affinity, thus another key area for development was to expand the target prediction models with functional prediction (activation or inhibition). This extra level of annotation is important since the activation or inhibition of a target may positively or negatively impact the phenotypic response in a biological system. Furthermore, many existing methods do not utilize the wealth of bioactivity information held for orthologue species. We therefore also focused on an in-depth analysis of orthologue bioactivity data and its relevance and applicability towards expanding compound and target bioactivity space for predictive studies. 
The realized protocol was trained with 13,918,879 compound-target pairs and comprises 1,651 targets, which has been made available for public use at GitHub. Consequently, the methodology was applied to aid with the target deconvolution of AstraZeneca phenotypic readouts, in particular for the rationalization of cytotoxicity and cytostaticity in the High-Throughput Screening (HTS) collection. Results from this work highlighted which targets are frequently linked to the cytotoxicity and cytostaticity of chemical structures, and provided insight into which compounds to select or remove from the collection for future screening projects. Overall, this project has furthered the field of in silico target deconvolution, by improving the performance and applicability of current protocols and by rationalizing cytotoxicity, which has been shown to influence attrition in clinical trials.
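Of the calibration methods the abstract names, isotonic regression is the easiest to sketch without a library. The following is a minimal pool-adjacent-violators implementation, offered as an illustration of the general technique and not as the thesis's code: it assumes raw classifier scores with binary labels (1 for active, 0 for inactive) and returns a non-decreasing step function mapping scores to calibrated probabilities. The function names and the toy data are hypothetical.

```python
import bisect


def isotonic_fit(scores, labels):
    """Pool-adjacent-violators: sort by score, then merge neighbouring
    blocks until the block means are non-decreasing.
    Returns (sorted_scores, fitted_probabilities)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    xs = [scores[i] for i in order]
    blocks = []  # each block is [sum_of_labels, count]
    for i in order:
        blocks.append([float(labels[i]), 1])
        # Merge backwards while the previous block's mean exceeds this one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return xs, fitted


def isotonic_predict(xs, fitted, score):
    """Map a new score to the fitted value of the nearest training score
    at or below it (scores below the training range get the first value)."""
    index = bisect.bisect_right(xs, score) - 1
    return fitted[max(index, 0)]


# Toy scores and labels, chosen only for illustration.
xs, fitted = isotonic_fit([0.1, 0.2, 0.3, 0.4, 0.5, 0.6], [0, 1, 0, 1, 1, 1])
print(fitted)
print(isotonic_predict(xs, fitted, 0.35))
```

In practice the fit would be made on held-out data rather than the training set, and tied scores would usually be pooled before fitting; both are omitted here to keep the sketch short.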
