  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
71

Protein Binding Site Similarities as Driver for Drug Repositioning

Haupt, Joachim 28 May 2014
Drug repositioning applies existing drugs to new disease indications. A prerequisite for drug repurposing is drug promiscuity, a drug's ability to bind to several targets, which can also lead to side effects. One reason for drug promiscuity is binding site similarity between (otherwise unrelated) proteins. In this thesis, a new algorithm for remote binding site similarity assessment and its application to the whole of the Protein Data Bank (PDB) are presented, forming the basis for off-target identification and drug repositioning. The present thesis contributes to a long-standing debate on the reasons for drug promiscuity, being one of the pioneer studies investigating these from a protein structural point of view. Except for a small influence of flexibility, the analysis of all promiscuous drugs in the PDB revealed that drug properties are of minor importance. However, a strong correlation between promiscuity and binding site similarity of protein targets is found (r = 0.81), suggesting binding site similarity as the main reason for drug promiscuity. For 71 % of the promiscuous drugs at least one pair of their targets' binding sites is similar and for 18 % all are similar. In order to overcome issues in detection of remotely similar binding sites, a score for binding site similarity is developed: LigandRMSD measures the similarity of the aligned ligands and uncovers remote local similarities in proteins. It can be applied to arbitrary binding site alignments and also works on distinct ligands on a structural proteome scale. To answer the question of which other targets might be hit when targeting a particular protein, an all-to-all binding site alignment of 32,202 protein structures is analyzed. Of the hundreds of millions of possible protein pairs, 0.27 % were found to have similar binding sites. Extrapolating to the human proteome, an average of 54 proteins with a similar binding site is expected for each human protein.
Clearly, this is in contrast to the one drug-one target paradigm in drug development. Based on these data, disadvantageous off-targets can be uncovered and drug-repositioning candidates inferred. The enormous potential is demonstrated with the example of Viagra, proposing it for repositioning to Alzheimer's disease and prostate cancer. The findings in this thesis question the established single-target dogma in drug discovery. Drugs are triggered to modulate multiple targets simultaneously by the widespread binding site similarity. With the presented pipeline, drug targets can be reliably predicted: Starting from a target protein, additional targets are predicted based on binding site similarity and prioritized according to the resulting ligand structural overlap. Identifying drug targets helps to understand severe side effects and opens the door for drug repositioning.
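
The idea of scoring a binding site alignment by how well the bound ligands overlap can be illustrated with a small sketch. This is not the thesis's LigandRMSD implementation: the function name `ligand_rmsd` and the symmetric nearest-neighbour atom matching are assumptions chosen here so that ligands with different atom counts can be compared, and both coordinate sets are assumed to be in the same frame already (that is, the binding site alignment has been applied).

```python
import math


def ligand_rmsd(coords_a, coords_b):
    """Symmetric nearest-neighbour RMSD between two ligand coordinate sets.

    Hypothetical helper: assumes both ligands were already superimposed
    by a binding site alignment. Each atom is matched to its nearest atom
    in the other ligand (an assumption, not the thesis's definition).
    """
    def mean_sq_nearest(src, dst):
        # Mean squared distance from each atom in src to its nearest atom in dst.
        return sum(min(math.dist(p, q) ** 2 for q in dst) for p in src) / len(src)

    forward = mean_sq_nearest(coords_a, coords_b)
    backward = mean_sq_nearest(coords_b, coords_a)
    return math.sqrt(0.5 * (forward + backward))


if __name__ == "__main__":
    ligand_1 = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
    ligand_2 = [(0.1, 0.0, 0.0), (1.4, 0.2, 0.0), (2.9, 0.0, 0.0)]
    print(ligand_rmsd(ligand_1, ligand_2))
```

A low value indicates that the two ligands occupy nearly the same space after alignment, which is the kind of signal the abstract describes for prioritizing predicted targets.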
72

Improving protein docking with binding site prediction

Huang, Bingding 10 July 2008
Protein-protein and protein-ligand interactions are fundamental as many proteins mediate their biological function through these interactions. Many important applications follow directly from the identification of residues in the interfaces between protein-protein and protein-ligand interactions, such as drug design, protein mimetic engineering, elucidation of molecular pathways, and understanding of disease mechanisms. The identification of interface residues can also guide the docking process to build the structural model of protein-protein complexes. This dissertation focuses on developing computational approaches for protein-ligand and protein-protein binding site prediction and applying these predictions to improve protein-protein docking. First, we develop an automated approach, LIGSITEcs, to predict protein-ligand binding sites, based on the notion of surface-solvent-surface events and the degree of conservation of the involved surface residues. We compare our algorithm to four other approaches, LIGSITE, CAST, PASS, and SURFNET, and evaluate all on a dataset of 48 unbound/bound structures and 210 bound structures. LIGSITEcs performs slightly better than the other tools and achieves a success rate of 71% and 75%, respectively. Second, for protein-protein binding sites, we develop metaPPI, a meta server for interface prediction. MetaPPI combines results from a number of tools, such as PPI_Pred, PPISP, PINUP, Promate, and SPPIDER, which predict enzyme-inhibitor interfaces with success rates of 23% to 55% and other interfaces with 10% to 28% on a benchmark dataset of 62 complexes. After refinement, metaPPI significantly improves prediction success rates to 70% for enzyme-inhibitor and 44% for other interfaces. Third, for protein-protein docking, we develop an FFT-based docking algorithm and system, BDOCK, which includes specific scoring functions for specific types of complexes.
BDOCK uses family-based residue interface propensities as a scoring function and obtains improvement factors of 4-30 for enzyme-inhibitor and 4-11 for antibody-antigen complexes in two specific SCOP families. Furthermore, the degrees of buriedness of surface residues are integrated into BDOCK, which improves the shape discriminator for enzyme-inhibitor complexes. The predicted interfaces from metaPPI are integrated as well, either during docking or after docking. The evaluation results show that reliable interface predictions improve the discrimination between near-native solutions and false positives. Finally, we propose an implicit method to deal with the flexibility of proteins by softening the surface, to improve docking for non-enzyme-inhibitor complexes.
73

Primer tRNA annealing by human immunodeficiency virus type 1

Jones, Christopher P. 25 June 2012
No description available.
74

Thermodynamics of λ-PCR Primer Design and Effective Ribosome Binding Sites

Berg, Emily Katherine 07 June 2019
Recombinant DNA technology has been commonly used in a number of fields to synthesize new products or generate products with a new pathway. Conventional cloning methods are expensive and require significant time and labor; λ-PCR, a new cloning method developed in the Senger lab, has a number of advantages compared to other cloning processes due to its employment of relatively inexpensive and widely available materials and time-efficiency. While the amount of lab work required for the cloning process is minimal, the importance of accurate primer design cannot be overstated. The target of this study was to create an effective procedure for λ-PCR primer design that ensures accurate cloning reactions. Additionally, synthetic ribosome binding sites (RBS) were included in the primer designs to test heterologous protein expression of the cyan fluorescent reporter with different RBS strengths. These RBS sequences were designed with an online tool, the RBS Calculator. A chimeric primer design procedure for λ-PCR was developed and shown to effectively create primers used for accurate cloning with λ-PCR; this method was used to design primers for CFP cloning in addition to two enzymes cloned in the Senger lab. A total of five strains of BL21(DE3) with pET28a + CFP were constructed, each with the same cyan fluorescent protein (CFP) reporter but different RBS sequences located directly upstream of the start codon of the CFP gene. Expression of the protein was measured using both whole-cell and cell-free systems to determine which system yields higher protein concentrations. A number of other factors were tested to optimize conditions for high protein expression, including: induction time, IPTG concentration, temperature, and media (for the cell-free experiments only). Additionally, expression for each synthetic RBS sequence was investigated to determine an accurate method for predicting protein translation. 
NUPACK and the Salis Lab RBS Calculator were both used to evaluate the effects of these different synthetic RBS sequences. The results of the plate reader experiments with the 5 CFP strains revealed a number of factors to be statistically significant when predicting protein expression, including: IPTG concentration, induction time, and in the cell-free experiments, type of media. The whole-cell system consistently produced higher amounts of protein than the cell-free system. Lastly, contrasts between the CFP strains showed each strain's performance did not match the predictions from the RBS Calculator. Consequently, a new method for improving protein expression with synthetic RBS sequences was developed using relationships between Gibbs free energy of the RBS-rRNA complex and expression levels obtained through experimentation. Additionally, secondary structure present at the RBS in the mRNA transcript was modeled with strain expression since these structures cause deviations in the relationship between Gibbs free energy of the mRNA-rRNA complex and CFP expression. / Master of Science / Recombinant DNA technology has been used to genetically enhance organisms to produce greater amounts of a product already made by the organism or to make an organism synthesize a new product. Genes are commonly modified in organisms using cloning practices which typically involves inserting a target gene into a plasmid and transforming the plasmid into the organism of interest. A new cloning process developed in the Senger lab, λ-PCR, improves the cloning process compared to other methods due to its use of relatively inexpensive materials and high efficiency. A primary goal of this study was to develop a procedure for λ-PCR primer design that allows for accurate use of the cloning method. Additionally, this study investigated the use of synthetic ribosome binding sites to control and improve expression of proteins cloned into an organism. 
Ribosome binding sites are sequences located upstream of the gene that increase the molecule’s affinity for the rRNA sequence on the ribosome, bind to the ribosome just upstream of the beginning of the gene, and initiate expression of the gene. Tools have been developed that create synthetic ribosome binding sites designed to produce specific amounts of protein. For example, the tools can increase or decrease expression of a gene depending on the application. These tools, the Salis Lab RBS Calculator and NUPACK, were used to design and evaluate the effects of the synthetic ribosome binding sites. Additionally, a new method was created to design synthetic ribosome binding sites since the methods used during the design process yielded inaccuracies. Each strain of E. coli contained the same gene, a cyan fluorescent protein (CFP), but had different RBS sequences located upstream of the gene. Expression of CFP was controlled via induction, meaning the addition of a particular molecule, IPTG in this system, triggered expression of CFP. Each of the CFP strains was tested under a variety of conditions in order to find the conditions most suitable for protein expression; the variables tested include induction time, IPTG (inducer) concentration, and temperature. Media was also tested for the cell-free systems, meaning the strains were grown overnight for 18 hours and lysed, a process where the cell membrane is broken in order to utilize the cell’s components for protein expression; the cell lysate was resuspended in new media for the experiments. ANOVA and multiple linear regression revealed IPTG concentration, induction time, and media to be significant factors impacting protein expression. This analysis also showed each CFP strain did not perform as the RBS Calculator predicted.
Modeling each strain’s CFP expression using the RBS-rRNA binding strengths and secondary structures present in the RBS allowed for the creation of a new model for predicting and designing RBS sequences.
75

Caractérisation et ciblage de la reconnaissance dynamique de Trp37-G lors de l’interaction de la protéine NCp7 de HIV-1 avec des acides nucléiques / Characterization and targeting the dynamic recognition of Trp37-G during the interaction of NCp7 protein of HIV-1 with nucleic acids

Sharma, Rajhans 10 April 2018
The nucleocapsid protein (NC) plays crucial roles in the HIV-1 life cycle through its nucleic acid (NA) chaperoning property, which involves recognition of a guanine residue of the target nucleic acid sequence by its Trp37 residue. Herein, we characterized this dynamic Trp37-G recognition with sequences involved in reverse transcription and genomic RNA packaging. Using the fluorescent thienoguanosine (thG) and 2-thienyl-3-hydroxychromone (3HCnt) nucleoside analogues, we determined the whole set of kinetic rate constants for annealing of (-)PBS with (+)PBS in the absence and presence of NC. We also investigated the role of the NA sugar in NC-RNA and NC-DNA complexes, as NC binds with opposite polarity to DNA and RNA sequences.
We confirmed that the interaction of the Trp37 residue with guanines was critical for the formation of complexes with both RNA and DNA variants of PBS and SL3. Finally, we performed screening of NC inhibitors and tested the selected hits on a thG-based assay.
76

Computational Methods for Inferring Transcription Factor Binding Sites

Morozov, Vyacheslav 11 October 2012
Position weight matrices (PWMs) have become a tool of choice for the identification of transcription factor binding sites in DNA sequences. PWMs are compiled from experimentally verified and aligned binding sequences. PWMs are then used to computationally discover novel putative binding sites for a given protein. DNA-binding proteins often show degeneracy in their binding requirements, and the overall binding specificity of many proteins is unknown and remains an active area of research. Although PWMs are more reliable predictors than consensus string matching, they generally result in a high number of false positive hits. A previous study introduced a novel method for PWM training that uses the known motifs to sample additional putative binding sites from a proximal promoter area. The core idea was further developed, implemented and tested in this thesis with a large-scale application. Improved mono- and dinucleotide PWMs were computed for Drosophila melanogaster. The Matthews correlation coefficient was used as an optimization criterion in the PWM refinement algorithm. New PWMs take account of non-uniform background nucleotide distributions on the promoters and consider a larger number of new binding sites during the refinement steps. The optimization included the PWM motif length, the position on the promoter, the threshold value and the binding site location. The obtained predictions were compared for mono- and dinucleotide PWM versions with initial matrices and with conventional tools. The optimized PWMs predicted new binding sites with better accuracy than conventional PWMs.
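
The ingredients named in this abstract (a mononucleotide PWM built from aligned sites, a background nucleotide distribution, a score threshold, and the Matthews correlation coefficient as a quality measure) can be sketched as follows. This is a generic textbook-style illustration, not the refinement algorithm of the thesis; the function names, the pseudocount, and the example sites are all assumptions, and sequences are assumed to contain only uppercase A, C, G and T.

```python
import math


def build_pwm(sites, background, pseudocount=1.0):
    """Log-odds PWM (one dict per position) from equal-length aligned sites."""
    pwm = []
    for i in range(len(sites[0])):
        column = {}
        for base in "ACGT":
            count = sum(1 for s in sites if s[i] == base) + pseudocount
            freq = count / (len(sites) + 4 * pseudocount)
            # Log-odds against the (possibly non-uniform) background.
            column[base] = math.log2(freq / background[base])
        pwm.append(column)
    return pwm


def scan(sequence, pwm, threshold):
    """Return (start, score) for every window scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        score = sum(pwm[i][sequence[start + i]] for i in range(width))
        if score >= threshold:
            hits.append((start, score))
    return hits


def matthews_cc(tp, fp, tn, fn):
    """Matthews correlation coefficient; defined as 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
    pwm = build_pwm(["TATA", "TATA", "TATA", "TACA"], uniform)
    print(scan("GGTATAGG", pwm, threshold=4.0))
    print(matthews_cc(tp=8, fp=2, tn=9, fn=1))
```

In a refinement loop of the kind the abstract describes, the threshold and motif length would be varied and the setting with the highest Matthews correlation coefficient on labelled sites retained; that loop is left out here to keep the sketch short.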
77

Targeting The CD4 Binding Site In HIV-1 Immunogen Design

Bhattacharyya, Sanchari 07 1900
Over three decades have passed since the discovery of HIV-1, yet an AIDS vaccine remains elusive. The envelope glycoprotein of HIV-1, gp120, is the most exposed protein on the viral surface and thus serves as an important target for vaccine design. However, various factors like high mutability of gp120, extensive glycosylation and very high conformational flexibility of gp120 have confounded all efforts to design a suitable immunogen that elicits broad and potent neutralizing antibodies against HIV-1. In Chapter 1, a brief description of the structural organization of HIV-1 along with the progress made and the difficulties encountered in the development of a vaccine are presented. In Chapter 2, the design and characterization of an outer domain immunogen of HIV-1 gp120 is discussed. The outer domain (OD) of the envelope glycoprotein gp120 is an important target for vaccine design since it contains a number of conserved epitopes, including a large fraction of the CD4 binding site. Attempts to design OD based immunogens in the past have met with little success. In this work, we designed an OD immunogen based on the sequence of the HXBc2 strain, expressed and purified it from E. coli (ODEC). The ODEC molecule lacks the variable loops V1V2 and V3 and incorporates 11 designed mutations at the interface of the inner and the outer domains of gp120 to increase solubility. Biophysical studies showed that ODEC is folded and protease resistant while ODEC lacking the designed mutations is highly aggregation prone. In contrast to previously characterized OD constructs, ODEC bound CD4 and the broadly neutralizing antibody b12 with micromolar affinities, but not the non-neutralizing antibodies b6 and F105. Further improvement in the refolding protocol yielded a better structured molecule that bound CD4, b12 and VRC01 with sub-micromolar affinities.
In rabbit immunization studies with animals primed with ODEC and boosted with gp120, the sera are able to neutralize Tier I viruses and some Tier II viruses like JRFL and RHPA with measurable IC50s. This is one of the first examples of a gp120 fragment based immunogen which was able to elicit sera that showed modest neutralization of some Tier II viruses. Subsequently, amide hydrogen-deuterium exchange studies of ODEC showed that though the molecule is well-folded, it is labile to exchange. This might indicate why ODEC does not elicit high amounts of neutralizing antibodies. In Chapter 3, we report the design and characterization of two smaller fragments of gp120 (b121a and b122a) to target the epitope of the broadly neutralizing antibody b12. The region chosen comprised a compact beta barrel in the lower part of the outer domain of gp120. Unlike ODEC, the fragments corresponding to these constructs were not contiguous stretches in gp120. Thus we used linkers to connect them. Further, nine designed mutations were introduced at exposed hydrophobic regions of the fragment to increase its solubility. The designed protein fragments were expressed in E. coli in order to prevent glycosylation and consequent epitope masking that might occur if expressed in a eukaryotic expression system. Biophysical studies showed that b121a/b122a are partially folded. Disulfide mapping studies showed that the expected disulfide bridges were formed. The designed immunogens could bind b12, but not the non-neutralizing antibody b6. Sera from rabbits primed with b121a/b122a protein fragments and boosted with full-length gp120 showed broad neutralizing activity against a 20 virus panel including Tier 2 and 3 viruses such as PVO4, CAAN, CAP45 and ZM233. Sera from animals that received only gp120 showed substantially decreased breadth and potency.
Serum depletion studies confirmed that neutralization was gp120 directed and that a substantial fraction of it was mediated by CD4 binding site (CD4bs) antibodies. The data demonstrate that it is possible to elicit broadly neutralizing sera against HIV-1 in small animals, despite the restricted germline VH gene usage observed so far in broadly neutralizing CD4bs directed antibodies in humans. In Chapter 3, we also discuss design of a new construct b122d, which includes regions corresponding to b121a, but with linker connectivities similar to b122a. It was found to bind b12 with sub-micromolar affinity and also showed proteolytic resistance comparable to b121a. This indicated that though b121a showed better proteolytic resistance than b122a, it bound b12 poorly because one of the linkers might sterically occlude the b12 binding site. As the b12 binding site constructs based on the subtype B HXBc2 sequence elicited neutralizing antibodies, we chose to design similar constructs based on a subtype C sequence. The proteins (Cb122a and Cb122d) were purified from E. coli, characterized and found to bind b12 with micromolar affinity. The new constructs (b122d, Cb122a, Cb122d) will shortly be tested in animal immunizations. Disulfides are known to stabilize proteins by reducing the entropy of the unfolded state. In Chapter 4, we attempted to stabilize b122a by engineering disulfides. The disulfides are expected to rigidify the molecule and possibly improve its ability to elicit neutralizing antibodies. Some of the disulfides tested in b122a were predicted based on stereo-chemical criteria by the program MODIP (Modeling Disulfide Bridges in Proteins), while others were chosen at non-hydrogen bonded positions (NHB) on anti-parallel beta strands, based on earlier studies in the lab. Some of the disulfide mutants showed better binding to b12 and increased protection to enzymatic digestion. 
These disulfides were subsequently engineered into other b12 binding site constructs, namely b122d, Cb122a and Cb122d and these were biophysically characterized. Amongst the various disulfides that were tested in b122a, the one at 293-448 (according to HxBc2 numbering) was found to improve the binding to b12 by about 16-fold. Not only did this disulfide improve the binding of b122a to b12, it also showed similar improvement in the case of b122d and both the subtype C constructs tested. Moreover, since the position 293-448 is an exposed NHB position of an anti-parallel beta strand, spontaneous formation of the disulfide and the improved binding to b12 for all the proteins tested reinforce the conclusion that cysteines engineered at such positions lead to formation of a stabilizing disulfide. All the proteins containing the 293-448 disulfide will be used in future for rabbit immunization studies to examine if they elicit better neutralizing antibodies than the parent b122a molecule. As discussed in Chapter 2, ODEC showed a very fast rate of hydrogen exchange, indicating that it is flexible. As the 293-448 disulfide improved the binding of b12 binding site constructs, in Chapter 5, disulfides at exposed NHB positions were introduced in the context of ODEC. Previously engineered inter-domain disulfides have been shown to reduce the conformational flexibility of gp120. The disulfides in the lower beta barrel of the outer domain which harbors the CD4 binding site were found to be monomeric, oxidized and could bind neutralizing CD4bs antibodies better than the WT protein. On the other hand, the disulfides in the upper barrel of the outer domain were aggregated and bound antibodies poorly compared to the WT protein, indicating that this part of the molecule may not be well structured in the fragment. However, there was no significant change in the hydrogen exchange kinetics for these mutants.
Mutations in the Phe-43 cavity of gp120 (S375W/T257S) which constrain gp120 in the CD4 bound conformation were also tested in ODEC (ODEC-CF). This protein was found to bind CD4 and VRC01 about 8 and 2 times better respectively than WT ODEC. These improved immunogens will be used shortly in rabbit immunization studies. In an attempt to improve the immunogenicity of the gp120 fragment proteins, b121a, b122a and ODEC were displayed on/conjugated to the surface of Qβ virus like particles in Chapter 6. Exposed single cysteine mutants of these proteins were purified, characterized biophysically and found to have the single cysteine free for conjugation. These were subsequently conjugated to the Qβ virus like particles through click chemistry (carried out in Prof. MG Finn’s lab at TSRI), purified and used for rabbit immunization studies. The gp120 ELISA titers of the elicited sera showed that conjugation may be a better option to display foreign antigens on the surface than genetic fusion. There was no difference in the ELISA titers with and without adjuvant, indicating that the particles are sufficiently immunogenic in themselves. Sera from these studies will be tested in neutralization assays. The overall utility of the particle based display approach will be assessed by comparing neutralization data from particle based immunizations to identical immunizations with unconjugated immunogens. Most HIV-1 broadly neutralizing antibodies are directed against the gp120 subunit of the Env surface protein. Native Env consists of a trimer of gp120:gp41 heterodimers, and in contrast to monomeric gp120, preferentially binds CD4 binding site (CD4bs) directed neutralizing antibodies over non-neutralizing ones. Some cryo-electron tomography studies have suggested that the V1V2 loop regions of gp120 are located close to the trimer interface. 
To understand this further, in Chapter 7, we have designed cyclically permuted variants of gp120 with and without the h-CMP and SUMO2a trimerization domains inserted into the V1/V2 loop. h-CMP-V1cyc is one such variant where 153 and 142 are the N and C terminal residues of cyclically permuted gp120 and h-CMP is fused to the N-terminus. This molecule forms a trimer under native conditions and binds CD4 and the neutralizing CD4bs antibody b12 with significantly higher affinity relative to wtgp120. It binds the non-neutralizing CD4bs antibody F105 with lower affinity than gp120. A similar derivative, h-CMP-V1cyc1, bound the V1V2 directed broadly neutralizing antibodies PG9/PG16 with ~20 fold higher affinity compared to wild type JRCSF gp120. These cyclic permutants of gp120 are properly folded and are potential immunogens. The data also support Env models in which the V1V2 loops are proximal to the trimer interface. In Appendix A1, peptide analogs of selected secondary structural elements of gp120 were designed. Some of them were grafted on known scaffold proteins. The synthesized peptides were characterized biophysically. Most of the peptides did not have a well-defined secondary structure, indicating that they are not stable in isolation. Hence they were not pursued for further studies. One helical peptide adopted a significant amount of structure in aqueous buffer and will be shortly conjugated to carrier proteins and used in immunization studies. In Appendix A2, we created error-prone PCR libraries and loop-randomization libraries of b12 binding site constructs and attempted to screen these for better b12 binding using phage-display. However, the screening was unsuccessful as the phages showed non-specific binding to b12 antibody. These libraries will be screened in future using yeast display.
78

Enrichment of miRNA targets in REST-regulated genes allows filtering of miRNA target predictions

Gebhardt, Marie Luise 08 January 2016
Predictions of miRNA binding sites suffer from high false positive rates (24-70%), and measuring biological interactions of miRNAs and target transcripts on a genome-wide scale remains challenging. This thesis addresses whether the ever-growing body of ChIP-sequencing data can be used to filter miRNA target predictions by making use of the underlying regulatory network of miRNAs and transcription factors. First, different methods for associating ChIP-sequencing peaks with target genes were tested. Target gene lists of the transcriptional repressor RE1-silencing transcription factor (REST/NRSF) were generated by means of ChIP-sequencing data. An enrichment analysis tool based on predictions from TargetScanHuman was developed and applied to find ‘enrichment’-miRNAs with over-represented targets in the REST gene lists. The detected miRNAs were shown to be part of a highly regulated REST-miRNA network. Possible functions could be assigned to them, and their role in the regulatory network and in special network motifs (incoherent feedforward loop of type 2) was analyzed. miRNA target predictions for genes shared by enrichment-miRNAs and REST had a higher proportion of true positive associations than the TargetScanHuman background, so the procedure made filtering possible.
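
Over-representation of one gene set in another is commonly tested with a one-sided hypergeometric test, and the sketch below shows that calculation. The abstract does not state which statistic the thesis's enrichment tool uses, so the choice of test, the function name `enrichment_pvalue`, and the example counts are all assumptions for illustration only.

```python
from math import comb


def enrichment_pvalue(population, targets, list_size, overlap):
    """One-sided hypergeometric tail probability P(X >= overlap).

    population: number of genes in the background
    targets:    background genes predicted as targets of one miRNA
    list_size:  genes in the regulated list (e.g. a REST target list)
    overlap:    genes in the list that are also predicted miRNA targets
    """
    upper = min(targets, list_size)
    total = comb(population, list_size)
    tail = sum(
        comb(targets, k) * comb(population - targets, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts, not taken from the thesis.
    print(enrichment_pvalue(population=20000, targets=500, list_size=1000, overlap=45))
```

A small p-value would mark the miRNA as a candidate "enrichment" miRNA for that gene list; in practice a multiple-testing correction across all miRNAs would also be needed.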
79

Computational and experimental approaches to regulatory genetic variation

Andersen, Malin January 2007
Genetic variation is a strong risk factor for many human diseases, including diabetes, cancer, cardiovascular disease, depression, autoimmunity and asthma. Most of the disease genes identified so far alter the amino acid sequences of encoded proteins. However, a significant number of genetic variants affecting complex diseases may alter the regulation of gene transcription. The map of the regulatory elements in the human genome is still to a large extent unknown, and it remains a challenge to separate the functional regulatory genetic variations from linked neutral variations. The objective of this thesis was to develop methods for the identification of genetic variation with a potential to affect the transcriptional regulation of human genes, and to analyze potential regulatory polymorphisms in the CD36 glycoprotein, a candidate gene for cardiovascular disease. An in silico tool for the prediction of regulatory polymorphisms in human genes was implemented and is available at www.cisreg.ca/RAVEN. The tool was evaluated using experimentally verified regulatory single nucleotide polymorphisms (SNPs) collected from the scientific literature, and tested in combination with experimental detection of allele specific expression of target genes (allelic imbalance). Regulatory SNPs were shown to be located in evolutionarily conserved regions more often than background SNPs, but predicted transcription factor binding sites were unable to enrich for regulatory SNPs unless additional information linking transcription factors with the target genes was available. The in silico tool was applied to the CD36 glycoprotein, a candidate gene for cardiovascular disease. Potential regulatory SNPs in the alternative promoters of this gene were identified and evaluated in vitro and in vivo using a clinical study for coronary artery disease.
We observed association with the plasma concentrations of inflammation markers (serum amyloid A protein and C-reactive protein) in myocardial infarction patients, which highlights the need for further analyses of potential regulatory polymorphisms in this gene. Taken together, this thesis describes an in silico approach to identify putative regulatory polymorphisms which can be useful for directing limited laboratory resources to the polymorphisms most likely to have a phenotypic effect.