Global ETD Search

11	The Role of Affinity and Arrangement of Transcription Factor Binding Sites in Determining Hox-regulated Gene Expression Patterns Zandvakili, Arya 30 October 2018 (has links) No description available. Molecular Biology Transcriptional Regulation Hox Factors Transcription Factors Transcription Factor Binding Sites
12	In Silico Discovery of Pollen-specific Cis-regulatory Elements in the Arabidopsis Hydroxyproline-Rich Glycoprotein Gene Family Wolfe, Richard A. January 2014 (has links) No description available. Bioinformatics Computer Science Bioinformatics transcription factor transcription factor binding site motif discovery computer
13	Identification of de novo Transcription Factor Binding Motifs Created by Cancer-related Mutations Li, Siqi January 2022 (has links) In many countries, cancer is one of the biggest threats to citizens’ health, especially among older people. Genomic mutations play a crucial role in cancer cell development. In previous decades, cancer research mainly focused on mutations in coding regions. These mutations can directly change the encoded protein sequences and influence their functions. In recent years, as the function of non-coding regions has gradually become better understood, a growing number of studies have focused on the role of non-coding mutations in cancer. Transcription factors (TFs) are an important group of gene regulatory factors. These factors bind only to specific sequences in the genome called transcription factor binding motifs (TFBMs). Mutations in these motifs can disrupt TF binding and thus influence gene regulation. A framework called funMotifs was developed to predict and annotate functional TFBMs in the human genome. A previous study intersected mutation data from the Pan-Cancer Analysis of Whole Genomes (PCAWG) with the motifs in funMotifs, aiming to give a general view of the influence of cancer-related mutations on functional TF motifs. However, that study focused only on existing motifs previously identified in the normal genome, while de novo motifs that could potentially be created by mutations were disregarded. One such instance has been found near the TERT promoter, where mutations create a de novo ETS binding site and up-regulate TERT expression. My study aims to extend the scope of funMotifs from existing motifs to de novo motifs created by cancer-related mutations. I extended the original motifs in the funMotifs database and merged overlapping motifs into longer regulatory elements. Then I mutated these elements according to the mutation data from PCAWG. 
Next, I scanned the mutated elements and identified TF motifs. These motifs were then intersected with the original motifs in the funMotifs database to remove redundant results. After intersection and filtering, 2,525,771 de novo motifs were retained. These motifs mainly come from C2H2 zinc finger factors, tryptophan cluster factors, STAT domain factors, fork head/winged helix factors, MADS box factors and homeo domain factors. Although the de novo motifs found in this study still need further verification and analysis, for example of the change in information content at the mutated sites of the motifs, the results can be a useful data source for further research on the regulatory impact of cancer-related mutations. / transcription factor binding motifs funMotifs PCAWG mutations de novo motifs Bioinformatics and Systems Biology Bioinformatik och systembiologi
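The mutate-and-rescan step described in entry 13 can be illustrated with a small Python sketch. It is not the funMotifs pipeline: the count matrix, the uniform background, the pseudocount, the score threshold, and the function names (`log_odds_matrix`, `scan`, `apply_snv`, `de_novo_hits`) are all assumptions made for illustration. Only the forward strand is scanned, and sequences are assumed to contain only A, C, G and T. A motif hit is treated as de novo when it appears in the mutated sequence but not in the reference.

```python
import math

BACKGROUND = 0.25  # assumed uniform base frequency


def log_odds_matrix(counts, pseudocount=1.0):
    """Convert per-position base counts into log2 odds scores."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        matrix.append({
            base: math.log2(((column.get(base, 0) + pseudocount) / total) / BACKGROUND)
            for base in "ACGT"
        })
    return matrix


def scan(sequence, matrix, threshold):
    """Return start positions whose window scores at or above the threshold."""
    width = len(matrix)
    hits = set()
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.add(start)
    return hits


def apply_snv(sequence, position, alt):
    """Return a copy of the sequence with one base substituted (0-based position)."""
    return sequence[:position] + alt + sequence[position + 1:]


def de_novo_hits(reference, position, alt, matrix, threshold):
    """Motif hits present in the mutated sequence but absent from the reference."""
    mutated = apply_snv(reference, position, alt)
    return scan(mutated, matrix, threshold) - scan(reference, matrix, threshold)


# Toy motif with consensus GGAA (hypothetical counts, 10 observations per column).
TOY_COUNTS = [{"G": 10}, {"G": 10}, {"A": 10}, {"A": 10}]
TOY_MATRIX = log_odds_matrix(TOY_COUNTS)

# A G>A substitution at position 7 of a made-up sequence creates a GGAA site.
print(de_novo_hits("ACCGGGAGC", 7, "A", TOY_MATRIX, threshold=6.0))  # {4}
```

Subtracting the reference hits mirrors the abstract's step of intersecting with the original motifs to discard redundant results. A real analysis would also scan the reverse strand and use curated matrices.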
14	Incidence and Regulatory Implications of Single Nucleotide Polymorphisms among Established Ovarian Cancer Genes. Ramdayal, Kavisha. January 2009 (has links) Ovarian cancer research focuses on answering important questions related to the disease, determining whether new approaches are feasible to contribute towards improving current treatments or discovering new ones. This study focused on the transcriptional regulation of genes that have been implicated in ovarian cancer, based on the occurrences of single nucleotide polymorphisms (SNPs) within transcription factor binding sites (TFBSs). Through the application of several in silico tools, databases and custom programs, this research aimed to contribute toward the identification of potentially bio-medically important genes or SNPs for pre-diagnosis and subsequent treatment planning of ovarian cancer. A total of 379 candidate genes that have been experimentally associated with ovarian cancer were analyzed. This led to the identification of 121 SNPs that were found to coincide with putative TFBSs potentially influencing a total of 57 transcription factors that would normally bind to these TFBSs. These SNPs with potential phenotypic effect were then evaluated among several population groups, defined by the International HapMap consortium, resulting in the identification of three SNPs present in five or more of the eleven population groups that have been sampled. Ovarian cancer Susceptibility Single nucleotide polymorphism SNP@Promoter F-SNP PupaSuite Transcription factor binding site Transcription factor MATCH™ HapMap Allele.
16	Incidence and regulatory implications of single Nucleotide polymorphisms among established ovarian cancer genes Ramdayal, Kavisha January 2009 (has links) Magister Scientiae - MSc / Ovarian cancer research focuses on answering important questions related to the disease, determining whether new approaches are feasible to contribute towards improving current treatments or discovering new ones. This study focused on the transcriptional regulation of genes that have been implicated in ovarian cancer, based on the occurrences of single nucleotide polymorphisms (SNPs) within transcription factor binding sites (TFBSs). Through the application of several in silico tools, databases and custom programs, this research aimed to contribute toward the identification of potentially bio-medically important genes or SNPs for pre-diagnosis and subsequent treatment planning of ovarian cancer. A total of 379 candidate genes that have been experimentally associated with ovarian cancer were analyzed. This led to the identification of 121 SNPs that were found to coincide with putative TFBSs potentially influencing a total of 57 transcription factors that would normally bind to these TFBSs. These SNPs with potential phenotypic effect were then evaluated among several population groups, defined by the International HapMap consortium, resulting in the identification of three SNPs present in five or more of the eleven population groups that have been sampled. / South Africa Ovarian cancer Susceptibility Single nucleotide Polymorphism SNP@Promoter F-SNP PupaSuite Transcription factor binding site Transcription factor MATCH™ HapMap Allele
17	In silico investigation of glossina morsitans promoters Mwangi, Sarah Wambui January 2013 (has links) Philosophiae Doctor - PhD / Tsetse flies (Glossina spp.) are the biological vectors for Trypanosomes, the causative agents of Human African Trypanosomiasis (HAT). HAT is a debilitating disease that continues to present a major public health problem and remains a key factor limiting rural development in vast regions of tropical Africa. To augment vector control efforts, the International Glossina Genome Initiative (IGGI) was established in 2004 with the ultimate goal of generating a fully annotated whole genome sequence for Glossina morsitans. A working draft genome of Glossina morsitans was made available in 2011. In this thesis, transcriptional regulatory features in Glossina morsitans were analysed using the draft genome. A method for TSS identification in the newly sequenced Glossina morsitans genome was developed using TSS-seq tags sampled from two developmental stages of Glossina morsitans. High throughput next generation sequencing reads obtained from Glossina morsitans larvae and pupae were used to locate transcription start sites (TSS) in the Glossina morsitans genome. TSS-seq tag clusters, defined as a minimum number of reads at the 5’ predicted UTR or first coding exon, were used to define transcription start sites. A total of 3134 tag clusters were identified on the Glossina genome. Approximately 45.4% (1424) of the tag clusters mapped to the first coding exons or their proximal predicted 5’UTR regions and include 31 tag clusters that mapped to transposons. A total of 1101 (35.1%) tag clusters mapped outside the genic region and/or scaffolds without gene predictions and may correspond to previously un-annotated transcripts or noncoding RNA TSS. The core promoter regions were classified as narrow or broad based on the number of TSS positions within a TSS-seq cluster. 
The majority (95%) of the core promoters analysed in this study were of the broad type while only 5% were of the narrow type. Comparison of canonical core promoter motif occurrences between random and bona fide core promoters showed that, generally, the number of motifs in biologically functional genomic windows in the true dataset exceeded those in the random dataset (p <= 0.00164, 0.00135, 0.00185 for the narrow, broad with peak and broad without peak categories respectively). The frequency of motif co-occurrence in core promoters was found to be fundamentally different across various initiation patterns. Narrow core promoters recorded a higher frequency of the TATA-box and INR motifs and two-way motif co-occurrence showed that the TATA-box-INR pair is over-represented in the narrow category. Broad core promoters showed a higher frequency of the BREd and MTE motifs and two-way motif co-occurrence showed that the MTE-DPE pair is over-represented in broad core promoters. TATA-less promoters account for 77% of the core promoters in this analysis. TATA-less core promoters showed a higher frequency of the MTE and INR motifs in contrast to observations in Drosophila where the DPE motif has been reported to occur frequently in TATA-less promoters. These motif combinations suggest their equal importance to transcription in their corresponding promoter classes in Glossina morsitans. Glossina morsitans Human African trypanosomiasis Genome TSS-seq Transcription Transcription start site Promoter Transcription regulation Transcription factor binding site Database
18	A bioinformatics approach to the study of the transcriptional regulation of AMPA glutamate receptors (GRIAs) and genes whose expression are co-regulated with GRIAs Chong, Allen K.S. January 2009 (has links) Philosophiae Doctor - PhD / It was postulated that each gene has three main sets of transcriptional elements: one which is gene-specific, one which is family-specific, and a third which is tissue-specific. The starting hypothesis for this project had been: “Each family of genes has a distinct set of transcriptional elements that is unique to this family”. The primary aim of this project was therefore the identification of the family-specific set of transcriptional elements within the AMPA receptor gene family. The question, then, was how to measure or identify this uniqueness within the promoters of this family of genes. The answer seemed to lie in assessing the promoters of this family of genes against a background of a comprehensive set of promoter sequences and, in the process, trying to find the transcriptional elements that were present in the AMPA receptor gene promoters but were not so common in the general population of gene promoters. To achieve the primary aim of this project, it was essential that a comprehensive dataset of promoter sequences was available. There are ample data freely available through the web. However, they are often not available in the form one might want. Another problem that one constantly encounters is the lack of general consensus among the research community in agreeing on a standard annotation. For example, a gene can sometimes be given 2 or 3 different names by different laboratories which have successfully cloned the same gene. This, in turn, hinders the data collection process. 
At the start of this project, there was an existing curated database of experimentally-verified eukaryotic promoter sequences called the Eukaryotic Promoter Database (EPD) and software called Promoter Extraction from GenBank (PEG) which, as its name implies, extracts promoter sequences available through GenBank (Cavin Périer et al., 1998; Zhang & Zhang, 2001; Praz et al., 2002; Schmid et al., 2004). However, limitations existed in both these resources. For EPD, the number of curated promoter sequences available was low and also, the length of these promoter sequences was short. For PEG, the main limitation was that the extraction from GenBank would result in extraction of sequences of variable lengths. Therefore, the 5’-end Information Extraction (FIE) system was developed for the express purpose of collecting promoter sequences without the limitations of PEG. This software relies on the alignment of multiple mRNA/cDNA sequences that are representative of a gene on the human genomic sequence to determine the transcription start site (TSS) of the gene and thus, with this information, extract the promoter sequence for the gene from the available human genomic sequence. This was the first promoter extraction software to work on this principle (Chong et al., 2002). This method was later supported by experimental work carried out by Coleman and colleagues (2002). Using the FIE2 software (Chong et al., 2003), some 10,000-odd human promoter sequences were extracted, starting at 1500 bp upstream and ending at 1000 bp downstream of the 5’-most TSS. Following the collection of the human promoter sequences, the approach developed by Bajic et al. (2004) was applied to study the promoters of the AMPA receptor genes. This approach relies on both the MATCH program to map putative transcription factor binding sites (TFBSs) to the promoter sequences and software developed by Bajic et al. (2004) that calculates the density for each TFBS or composite element. 
Having calculated the densities for the TFBSs and composite elements for both the target promoters (in this case, the AMPA receptor gene promoters) and the background promoters (the 10,000-odd human promoters), the software then calculates the degree of over-representation of each TFBS and composite element in the target promoters (measured against the background promoters) and then ranks the “singles”, “pairs” and “triplets” in the order of their degree of over-representation. Using this method, I identified the top 3 ranked “single”, “pair” and “triplet” transcriptional elements found commonly within the AMPA receptor promoters. In addition, a conventional phylogenetic footprinting study was also carried out for the human, mouse and rat GRIA1 promoter to identify key transcriptional elements within this subunit’s promoter. While the approach developed by Bajic et al. (2004) identifies key family-specific transcriptional elements, the phylogenetic footprinting study helps identify key gene-specific transcriptional elements. Thus, they complement one another. The approach developed by Bajic et al. (2004) yielded an interesting result. It was found that the combination of the top 3 ranked “single”, “pair” and “triplet” transcriptional elements found in the AMPA receptor promoters was also found in 47 other genes. It was postulated that these 47 genes might, in fact, be co-regulated / co-expressed with the GRIAs, thus explaining the existence of a shared promoter profile with the GRIA promoters. In support of this hypothesis, evidence was found in the published literature that 7 of these 47 genes (VAMP4, Rab3B, FKBP8, 3-OST-3A, CLSTN3, SOCS1 and IκBβ) might indeed be involved in the expression and functioning of the AMPA receptors. Neuroscience Bioinformatics Transcription Promoter Transcription factor binding site Composite element Gene expression AMPA glutamate receptor Neurotransmitter Coexpression
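The density comparison described in entry 18 can be sketched in Python as follows. The abstract does not give the exact statistic used by Bajic et al. (2004), so this sketch assumes the simplest reading: density is hits per 1000 bp, and over-representation is the ratio of target density to background density. The function names, element labels and hit counts are hypothetical; only the 2500 bp promoter length (1500 bp upstream plus 1000 bp downstream) and the background size of roughly 10,000 promoters come from the abstract.

```python
def density(hit_count, total_bp):
    """Hits per 1000 bp of promoter sequence."""
    return 1000.0 * hit_count / total_bp


def rank_over_representation(target_hits, background_hits, target_bp, background_bp):
    """Rank elements by target density divided by background density (assumed metric)."""
    ranked = []
    for element, hits in target_hits.items():
        bg_hits = background_hits.get(element, 0)
        if bg_hits == 0:
            continue  # ratio undefined; a real analysis would need a policy here
        ratio = density(hits, target_bp) / density(bg_hits, background_bp)
        ranked.append((element, ratio))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


PROMOTER_BP = 2500                 # 1500 bp upstream + 1000 bp downstream of the TSS
target_bp = 4 * PROMOTER_BP        # toy target set of four promoters
background_bp = 10_000 * PROMOTER_BP

# Made-up hit counts for two made-up element labels.
target_hits = {"element_A": 12, "element_B": 4}
background_hits = {"element_A": 6000, "element_B": 8000}

for element, ratio in rank_over_representation(
        target_hits, background_hits, target_bp, background_bp):
    print(f"{element}: {ratio:.2f}x background density")
```

The same ranking applies unchanged to "pairs" and "triplets" if each composite element is given its own key, for example a tuple of element names.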
19	Combining Prior Information for the Prediction of Transcription Factor Binding Sites Benner, Philipp 21 June 2018 (has links) Despite the fact that each cell in an organism has the same genetic information, it is possible that cells fundamentally differ in their function. The molecular basis for the functional diversity of cells is governed by biochemical processes that regulate the expression of genes. Key to this regulatory process are proteins called transcription factors that recognize and bind specific DNA sequences of a few nucleotides. Here we tackle the problem of identifying the binding sites of a given transcription factor. The prediction of binding preferences from the structure of a transcription factor is still an unsolved problem. For that reason, binding sites are commonly identified by searching for overrepresented sites in a given collection of nucleotide sequences. Such sequences might be known regulatory regions of genes that are assumed to be coregulated, or they are obtained from so-called ChIP-seq experiments that identify approximately the sites that were bound by a given transcription factor. In both cases, the observed nucleotide sequences are much longer than the actual binding sites and computational tools are required to uncover the actual binding preferences of a factor. Aggravated by the fact that transcription factors recognize not only a single nucleotide sequence, the search for overrepresented patterns in a given collection of sequences has proven to be a challenging problem. Most computational methods merely relied on the given set of sequences, but additional information is required in order to make reliable predictions. Here, this information is obtained by looking at the evolution of nucleotide sequences. For that reason, each nucleotide sequence in the observed data is augmented by its orthologs, i.e. sequences from related species where the same transcription factor is present. 
By constructing multiple sequence alignments of the orthologous sequences it is possible to identify functional regions that are under selective pressure and therefore appear more conserved than others. The processing of the additional information exerted by ortholog sequences relies on a phylogenetic tree equipped with a nucleotide substitution model that not only carries information about the ancestry, but also about the expected similarity of functional sites. As a result, a Bayesian method for the identification of transcription factor binding sites is presented. The method relies on a phylogenetic tree that agrees with the assumptions of the nucleotide substitution process. Therefore, the problem of estimating phylogenetic trees is discussed first. The computation of point estimates relies on recent developments in Hadamard spaces. Second, the statistical model is presented that captures the enrichment and conservation of binding sites and other functional regions in the observed data. The performance of the method is evaluated on ChIP-seq data of transcription factors, where the binding preferences have been estimated in previous studies. info:eu-repo/classification/ddc/500 ddc:500
20	Improving computational predictions of Cis-regulatory binding sites in genomic data Rezwan, Faisal Ibne January 2011 (has links) Cis-regulatory elements are the short regions of DNA to which specific regulatory proteins bind and these interactions subsequently influence the level of transcription for associated genes, by inhibiting or enhancing the transcription process. It is known that much of the genetic change underlying morphological evolution takes place in these regions, rather than in the coding regions of genes. Identifying these sites in a genome is a non-trivial problem. Experimental (wet-lab) methods for finding binding sites exist, but all have some limitations regarding their applicability, accuracy, availability or cost. On the other hand computational methods for predicting the position of binding sites are less expensive and faster. Unfortunately, however, these algorithms perform rather poorly, some missing most binding sites and others over-predicting their presence. The aim of this thesis is to develop and improve computational approaches for the prediction of transcription factor binding sites (TFBSs) by integrating the results of computational algorithms and other sources of complementary biological evidence. Previous related work involved the use of machine learning algorithms for integrating predictions of TFBSs, with particular emphasis on the use of the Support Vector Machine (SVM). This thesis has built upon, extended and considerably improved this earlier work. Data from two organisms was used here. Firstly the relatively simple genome of yeast was used. In yeast, the binding sites are fairly well characterised and they are normally located near the genes that they regulate. The techniques used on the yeast genome were also tested on the more complex genome of the mouse. 
The regulatory mechanisms of the mouse are known to be considerably more complex, and it was therefore interesting to investigate the techniques described here on such an organism. The initial results were however not particularly encouraging: although a small improvement on the base algorithms could be obtained, the predictions were still of low quality. This was the case for both the yeast and mouse genomes. However, when the negatively labeled vectors in the training set were changed, a substantial improvement in performance was observed. The first change was to choose regions in the mouse genome that are distal to a gene, over 4000 base pairs away, as regions not containing binding sites. This produced a major improvement in performance. The second change was simply to use randomised training vectors, which contained no meaningful biological information, as the negative class. This gave some improvement on the yeast genome, but had a very substantial benefit for the mouse data, considerably improving on the aforementioned distal negative training data. In fact the resulting classifier was finding over 80% of the binding sites in the test set and moreover 80% of the predictions were correct. The final experiment used an updated version of the yeast dataset, using more state-of-the-art algorithms and more recent TFBS annotation data. Here it was found that using randomised or distal negative examples once again gave very good results, comparable to the results obtained on the mouse genome. Another source of negative data was tried for this yeast data, namely using vectors taken from intronic regions. Interestingly this gave the best results. 572.072
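The two figures quoted in entry 20 (the share of true binding sites found, and the share of predictions that were correct) correspond to recall and precision. A minimal Python sketch of how they are computed, assuming predictions and annotations are represented as sets of site positions; the function name and the toy positions are hypothetical and not taken from the thesis:

```python
def precision_recall(predicted, actual):
    """Compute precision and recall from sets of predicted and annotated site positions."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Toy data: five predicted positions, five annotated positions, four in common.
predicted_sites = {101, 250, 397, 512, 640}
annotated_sites = {101, 250, 397, 512, 777}

precision, recall = precision_recall(predicted_sites, annotated_sites)
print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=0.80 recall=0.80
```

Reporting both values matters here because the abstract notes that some base algorithms miss most binding sites (low recall) while others over-predict them (low precision).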

Search results