Thesis (Ph.D.)--Boston University / PLEASE NOTE: Boston University Libraries did not receive an Authorization To Manage form for this thesis or dissertation. It is therefore not openly accessible, though it may be available by request. If you are the author or principal advisor of this work and would like to request open access for it, please contact us at open-help@bu.edu. Thank you. / DNA sequencing techniques have evolved to the point where one can sequence millions of bases per minute, while our capacity to use this information has lagged behind. One particularly notorious example is the study of gene regulatory networks. A molecular study of gene regulation proceeds one protein at a time, requiring bench scientists to spend months purifying transcription factors and performing DNA footprinting studies. High-throughput options like ChIP-Seq and microarrays are a step up, but still require considerable manpower and materials. While computational biologists have developed methods to predict protein function, gene locations, and even metabolic networks from sequence, the reconstruction of regulatory networks from sequence remains virtually untouched. Part of the reason is that the components of a regulatory interaction, such as transcription factors and binding sites, are difficult to detect. The other, more prominent reason is that there exists no "recognition code" to determine which transcription factors will bind which sites. I have created a pipeline to reconstruct regulatory networks starting from an unannotated complete genomic sequence for a prokaryotic organism. The pipeline predicts necessary information, such as gene locations and transcription factor sequences, using custom tools and third-party software.
The core step is to determine the likelihood of interaction between a transcription factor (TF) and a binding site using a black-box recognition code developed by applying machine learning methods to databases of prokaryotic regulatory interactions. I show how one can use this pipeline to reconstruct the virtually unknown regulatory network of Bacillus anthracis. / 2031-01-01
Identifier | oai:union.ndltd.org:bu.edu/oai:open.bu.edu:2144/32880 |
Date | January 2012 |
Creators | Fichtenholtz, Alexander Michael |
Publisher | Boston University |
Source Sets | Boston University |
Language | en_US |
Detected Language | English |
Type | Thesis/Dissertation |