41 |
Elucidating the Mechanisms of Transposable Elements using Experimental and Bioinformatic Approaches: The hAT Superfamily of Transposable Elements in the Genome of Aedes aegypti and TE Displayer
Rooke, Rebecca 19 December 2011
Transposable elements (TEs) are found in nearly all eukaryotic genomes and are a major driving force of genome evolution. The hAT superfamily of TEs is found in a variety of organisms, including plants, fungi, insects and animals. To date, only 14 hAT TEs in the Aedes aegypti genome have been annotated as having a hAT transposase coding sequence. In this study, extensive bioinformatic approaches have been employed to find hAT TEs that encode transposases in the A. aegypti genome. A total of six newly identified TEs belonging to the hAT superfamily were discovered in the A. aegypti genome. Furthermore, a computer program called TE Displayer was developed to analyze TEs in genome sequences. TE Displayer detects TE-derived polymorphisms in genome datasets and presents the results on a virtual gel image. TE Displayer enables researchers to compare TE profiles in silico and provides a reference profile for experimental analyses.
|
42 |
High Throughput Prediction of Critical Protein Regions Using Correlated Mutation Analysis
Xu, Yongbai 29 July 2010
Correlated mutation analysis (CMA) is an effective approach for predicting functional and structural residue interactions from protein multiple sequence alignments. A prediction pipeline over the Pfam database was developed to predict residue contacts within protein domains. Cross-reference with the PDB showed these contacts are spatially close. Furthermore, we found our predictions to be biochemically reasonable and to correspond closely with known contact matrices. This large-scale search for coevolving regions within protein domains revealed that if two sites in an alignment covary, then neighboring sites in the alignment would also typically covary, resulting in clusters of covarying residues. The program PatchD was developed to measure the covariation between disconnected sequence clusters to reveal patch covariation. Patches that exhibited strong covariation identified multiple residues that were generally nearby in the protein structures, suggesting that the detection of covarying patches can be used in addition to traditional CMA approaches to reveal functional interaction partners.
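As a rough illustration of what "covariation between two alignment sites" can mean, the sketch below computes the mutual information between two columns of a multiple sequence alignment. Mutual information is only one common covariation score and is an assumption here; the abstract does not say which measure the Pfam pipeline or PatchD uses, and the toy alignment and function names are hypothetical.

```python
from collections import Counter
from math import log2


def column(alignment, i):
    """Return the residues found at position i of every aligned sequence."""
    return [seq[i] for seq in alignment]


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    This is an illustrative covariation score, not the thesis's method.
    """
    n = len(alignment)
    col_i, col_j = column(alignment, i), column(alignment, j)
    count_i, count_j = Counter(col_i), Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), c in count_ij.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_i[a] * count_j[b]))
    return mi


# Hypothetical toy alignment: columns 0 and 1 always change together,
# column 3 varies independently of column 0, column 2 is fully conserved.
toy_alignment = ["AKLE", "AKLD", "GRLE", "GRLD"]

covarying = mutual_information(toy_alignment, 0, 1)
independent = mutual_information(toy_alignment, 0, 3)
```

In this toy case the perfectly coupled pair scores 1 bit and the independent pair scores 0. Real alignments would additionally need gap handling, sequence weighting and a correction for phylogenetic background signal, none of which is attempted here.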
|
43 |
Mechanism of MicroRNA miR-520g Pathogenesis in CNS-PNET
Shih, J. H. David 25 August 2011
We recently discovered a high-level amplicon spanning the chr19q13.41 microRNA cluster in CNS Primitive Neuroectodermal Tumour, which results in striking upregulation of miR-520g. Constitutive over-expression of miR-520g in untransformed human neural stem cells enhanced cell growth, restricted differentiation down the neuronal lineage, and promoted expression of neural stem/progenitor cell markers. We thus hypothesize that ectopic miR-520g expression promotes tumourigenesis in part by inhibiting cellular differentiation. Consistent with this proposition, miR-520g is silenced upon embryonic stem cell differentiation and its expression is absent from most adult tissues. Moreover, expression analysis of miR-520g overexpressing cells revealed significant dysregulation of developmental signalling pathways. Further efforts focused on elucidating mechanisms of miR-520g function led to the identification of a cell cycle inhibitor, p21, as an important candidate target. These findings collectively suggest that miR-520g may modulate differentiation by regulating developmental signalling pathways and cell cycle exit of neural stem/progenitor cells.
|
44 |
Identifying Tissue Specific Distal Regulatory Sequences in the Mouse Genome
Chen, Chih-yu 06 December 2011
Epigenetic modifications, transcription factor (TF) availability and chromatin conformation influence how a genome is interpreted by the transcriptional machinery responsible for gene expression. Enhancers buried in non-coding regions are associated with significant differences in histone marks between different cell types. In contrast, gene promoters show more uniform modifications across cell types. In this report, enhancer identification is first carried out using an enhancer associated feature in mouse erythroid cells. Taking advantage of public domain ChIP-Seq data sets in mouse embryonic stem cells, an integrative model is then used to assess features in enhancer prediction, and subsequently locate enhancers. Significant associations with multiple TF bound loci, higher expression in the closest genes, and active enhancer marks support functionality and tissue-specificity of these enhancers. Motif enrichment analysis further determines known and novel TFs regulating the target cell type. Furthermore, the features identified can facilitate more accurate enhancer prediction in other cell types.
|
46 |
Bayesian Hidden Markov Models for finding DNA Copy Number Changes from SNP Genotyping Arrays
Kowgier, Matthew 31 August 2012
DNA copy number variations (CNVs), which involve the deletion or duplication of subchromosomal segments of the genome, have become a focus of genetics research. This dissertation develops Bayesian HMMs for finding CNVs from single nucleotide polymorphism (SNP) arrays.
A Bayesian framework to reconstruct the DNA copy number sequence from the observed sequence of SNP array measurements is proposed. A Markov chain Monte Carlo (MCMC) algorithm, with a forward-backward stochastic algorithm for sampling DNA copy number sequences, is developed for estimating model parameters. Numerous versions of Bayesian HMMs are explored, including a discrete-time model and different models for the instantaneous transition rates of change among copy number states of a continuous-time HMM. The most general model proposed makes no restrictions and assumes the rate of transition depends on the current state, whereas the nested model fixes some of these rates by assuming that the rate of transition is independent of the current state. Each model is assessed using a subset of the HapMap data. More general parameterizations of the transition intensity matrix of the continuous-time Markov process produced more accurate inference with respect to the length of CNV regions. The observed SNP array measurements are assumed to be stochastic with distribution determined by the underlying DNA copy number. Copy-number-specific distributions, including a non-symmetric distribution for the 0-copy state (homozygous deletions) and mixture distributions for 2-copy state (normal), are developed and shown to be more appropriate than existing implementations which lead to biologically implausible results.
Compared to existing HMMs for SNP array data, this approach is more flexible in that model parameters are estimated from the data rather than set to a priori values. Measures of uncertainty, computed as simulation-based probabilities, can be determined for putative CNVs detected by the HMM. Finally, the dissertation concludes with a discussion of future work, with special attention given to model extensions for multiple sample analysis and family trio data.
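The forward-backward stochastic sampling step mentioned above can be sketched for the simplest case: a discrete-time HMM with fixed parameters, where a forward filtering pass is followed by drawing a state path backwards. This is a minimal sketch under stated assumptions, not the dissertation's model. The three copy-number states, the Gaussian emission means, the standard deviation, the transition matrix and the signal values are all hypothetical, and the full method would alternate this step with parameter updates inside an MCMC loop.

```python
import random
from math import exp, pi, sqrt


def normal_pdf(x, mean, sd):
    """Density of a normal distribution, used as a toy emission model."""
    return exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * sqrt(2 * pi))


def forward_filter(obs_lik, trans, init):
    """Return alphas[t][k] = p(state_t = k | observations up to t)."""
    n_states = len(init)
    alphas, prev = [], None
    for lik in obs_lik:
        if prev is None:
            pred = init
        else:
            # One-step prediction through the transition matrix
            pred = [sum(prev[j] * trans[j][k] for j in range(n_states))
                    for k in range(n_states)]
        unnorm = [pred[k] * lik[k] for k in range(n_states)]
        total = sum(unnorm)
        prev = [u / total for u in unnorm]
        alphas.append(prev)
    return alphas


def backward_sample(alphas, trans, rng):
    """Draw one state path from its joint posterior given all observations."""
    n_states, length = len(alphas[0]), len(alphas)
    path = [0] * length
    path[-1] = rng.choices(range(n_states), weights=alphas[-1])[0]
    for t in range(length - 2, -1, -1):
        nxt = path[t + 1]
        # p(state_t = j | state_{t+1}, obs up to t) is proportional to
        # alphas[t][j] * trans[j][nxt]
        weights = [alphas[t][j] * trans[j][nxt] for j in range(n_states)]
        path[t] = rng.choices(range(n_states), weights=weights)[0]
    return path


# Hypothetical setup: state index 0, 1, 2 stands for 1, 2 and 3 copies.
means, sd = [-0.5, 0.0, 0.4], 0.15
trans = [[0.98, 0.01, 0.01], [0.01, 0.98, 0.01], [0.01, 0.01, 0.98]]
init = [0.05, 0.90, 0.05]
signal = [0.02, -0.05, -0.48, -0.55, -0.52, 0.01, 0.03]

obs_lik = [[normal_pdf(y, m, sd) for m in means] for y in signal]
alphas = forward_filter(obs_lik, trans, init)
path = backward_sample(alphas, trans, random.Random(0))
```

Repeating the backward draw many times and counting how often a position falls in a non-normal state gives a simulation-based probability of the kind the abstract describes for putative CNVs.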
|
47 |
Computational Prediction of PDZ Mediated Protein-protein Interactions
Hui, Shirley 09 January 2014
Many protein-protein interactions, especially those involved in eukaryotic signalling, are mediated by PDZ domains through the recognition of hydrophobic C-termini. The availability of experimental PDZ interaction data sets has led to the construction of computational methods to predict PDZ domain-peptide interactions. Such predictors are ideally suited to predict interactions in single organisms or for limited subsets of PDZ domains. As a result, the goal of my thesis has been to build general predictors that can be used to scan the proteomes of multiple organisms for ligands for almost all PDZ domains from select model organisms. A framework consisting of four steps (data collection, feature encoding, predictor training and evaluation) was developed and applied for all predictors built in this thesis.
The first predictor utilized PDZ domain-peptide sequence information from two interaction data sets obtained from high throughput protein microarray and phage display experiments in mouse and human, respectively. The second predictor used PDZ domain structure and peptide sequence information. I showed that these predictors are complementary to each other, are capable of predicting unseen interactions and can be used for the purposes of proteome scanning in human, worm and fly. As both positive and negative interactions are required for building a successful predictor, a major obstacle I addressed was the generation of artificial negative interactions for training. In particular, I used position weight matrices to generate such negatives for the positive-only phage display data and used a semi-supervised learning approach to overcome the problem of over-prediction (i.e. prediction of too many positives). These predictors are available as a community web resource: http://webservice.baderlab.org/domains/POW. Finally, a Bayesian integration method combining information from different biological evidence sources was used to filter the human proteome scanning predictions from both predictors. This resulted in the construction of a comprehensive, physiologically relevant, high-confidence PDZ-mediated protein-protein interaction network in human.
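One way position weight matrices could be used to produce artificial negatives from positive-only data is sketched below: a PWM is built from known binders, random peptides are scored against it, and only peptides scoring below every known binder are kept as putative non-binders. This selection rule is an assumption made for illustration, since the abstract does not describe the actual procedure. The peptides, the uniform background, the pseudocount and all function names are hypothetical.

```python
import random
from math import log2

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pwm(peptides, pseudocount=1.0):
    """Log-odds PWM (versus a uniform background) from equal-length peptides."""
    background = 1.0 / len(AMINO_ACIDS)
    pwm = []
    for pos in range(len(peptides[0])):
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for pep in peptides:
            counts[pep[pos]] += 1
        total = sum(counts.values())
        pwm.append({aa: log2((c / total) / background)
                    for aa, c in counts.items()})
    return pwm


def score(pwm, peptide):
    """Sum of per-position log-odds scores for one peptide."""
    return sum(col[aa] for col, aa in zip(pwm, peptide))


def sample_negatives(pwm, positives, n, rng, max_attempts=10000):
    """Collect n random peptides that score below every known positive."""
    threshold = min(score(pwm, p) for p in positives)
    length = len(positives[0])
    negatives = set()
    for _ in range(max_attempts):
        if len(negatives) == n:
            break
        candidate = "".join(rng.choice(AMINO_ACIDS) for _ in range(length))
        if candidate not in positives and score(pwm, candidate) < threshold:
            negatives.add(candidate)
    return sorted(negatives)


# Hypothetical C-terminal 4-mers standing in for phage display binders.
positives = ["ETSV", "ESTV", "QTSV", "ETDV", "KTSV"]
pwm = build_pwm(positives)
negatives = sample_negatives(pwm, positives, 5, random.Random(1))
```

Negatives chosen this way are only presumed non-binders, which is one reason a predictor trained on them may still over-predict and need a further correction step such as the semi-supervised approach the abstract mentions.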
|
49 |
Transcriptional Network Analysis During Early Differentiation Reveals a Role for Polycomb-like 2 in Mouse Embryonic Stem Cell Commitment
Walker, Emily 11 January 2012
We used mouse embryonic stem cells (ESCs) as a model to study the mechanisms that regulate stem cell fate. Using gene expression analysis during a time course of differentiation, we identified 281 candidate regulators of ESC fate. To integrate these candidate regulators into the known ESC transcriptional network, we incorporated promoter occupancy data for OCT4, NANOG and SOX2. We used shRNA knockdown studies followed by a high-content fluorescence imaging assay to test the requirement of our predicted regulators in maintaining self-renewal. We further integrated promoter occupancy data for the Polycomb group (PcG) proteins EED and PHC1 to identify 43 transcriptional networks in which we predict that OCT4 and NANOG co-operate with EED and PHC1 to influence the expression of multiple developmental regulators. Next, we turned our focus to the PcG protein PCL2, which we identified as being bound by both OCT4 and NANOG and down-regulated during differentiation. PcG proteins are conserved epigenetic transcriptional repressors that control numerous developmental gene expression programs. Using multiple biochemical strategies, we demonstrated that PCL2 associates with Polycomb Repressive Complex 2 (PRC2) in mouse ESCs, a complex that exerts its effect on gene expression through H3K27me3. Although PCL2 was not required for global histone methylation, it was required at specific target regions to maintain proper levels of H3K27me3. Knockdown of Pcl2 in ESCs resulted in heightened self-renewal characteristics and defects in differentiation. Integration of global gene expression and promoter occupancy analyses allowed us to identify PCL2 and PRC2 transcriptional targets and draft regulatory networks. We describe the role of PCL2 in modulating transcription of both ESC self-renewal genes in undifferentiated ESCs and developmental regulators during early commitment and differentiation.
|