1 |
Measuring Gene Expression With Next Generation Sequencing Technology
Busby, Michele Anne. January 2012
Thesis advisor: Gabor Marth / While a PhD student in Dr. Gabor Marth's laboratory, I have had primary responsibility for two projects focused on using RNA-Seq to measure differential gene expression. In the first project we used RNA-Seq to identify differentially expressed genes in four yeast species and I analyzed the findings in terms of the evolution of gene expression. In this experiment, gene expression was measured using two biological replicates of each species of yeast. While we had several interesting biological findings, during the analysis we dealt with several statistical issues that were caused by the experiment's low number of replicates. The cost of sequencing has decreased rapidly since this experiment's design and many of these statistical issues can now practically be avoided by sequencing a greater number of samples. However, there is little guidance in the literature as to how to intelligently design an RNA-Seq experiment in terms of the number of replicates that are required and how deeply each replicate must be sequenced. My second project, therefore, was to develop Scotty, a web-based program that allows users to perform power analysis for RNA-Seq experiments. The yeast project resulted in a highly accessed first author publication in BMC Genomics in 2011. I have structured my dissertation as follows: The first chapter, entitled General Issues in RNA-Seq, is intended to synthesize the themes and issues of RNA-Seq statistical analysis that were common to both papers. In this section, I have discussed the main findings from the two papers as they relate to analyzing RNA-Seq data. Like the Scotty application, this section is designed to be "used" by wet-lab biologists who have a limited background in statistics. While some background in statistics would be required to fully understand the following chapters, the essence of this background can be gained by reading this first chapter. 
The second and third chapters contain the two papers that resulted from the two RNA-Seq projects. Each chapter contains both the original manuscript and the original supplementary methods and data section. Finally, I include brief summaries of my contributions to the two papers on which I was a middle author. The first was a functional analysis of the genomic regions affected by mobile element insertions as a part of Chip Stewart's paper with the 1000 Genomes Consortium. This paper was published in PLoS Genetics. The second was a cluster analysis of microarray gene expression in Toxoplasma gondii, which was included as part of Alexander Lorestani et al.'s paper, Targeted proteomic dissection of Toxoplasma cytoskeleton sub-compartments using MORN1. This paper is currently under review. The yeast project was a collaborative effort between Jesse Gray, Michael Springer, and Allen Costa at Harvard Medical School, Jeffery Chuang here at Boston College, and members of the Marth lab. Jesse Gray conceived of the project. While I wrote the draft of the manuscript, many people, particularly Gabor Marth, provided substantial guidance on the actual text. I conceived of and implemented Scotty and wrote its manuscript with only editorial assistance from my co-authors. I produced all figures for the two manuscripts. Chip Stewart provided extensive guidance and mentorship to me on all aspects of my statistical analyses for both projects. / Thesis (PhD), Boston College, 2012. / Submitted to: Boston College. Graduate School of Arts and Sciences. / Discipline: Biology.
|
2 |
Evolução da expressão gênica em Calliphoridae (Diptera, Calyptratae): um modelo para o estudo do hábito de parasitismo / Gene expression evolution in Calliphoridae (Diptera, Calyptratae): a model to study feeding habits
Cardoso, Gisele Antoniazzi. 08 March 2019
Closely related species of the family Calliphoridae have contrasting feeding habits, such as feeding on the living tissues of a host (obligate parasitism) and feeding on decaying organic matter (necro-saprophagy). The evolutionary origins of parasitism in Calliphoridae are still unknown. However, what makes this family ideal for studying the evolution of feeding habits is that obligate parasitism has appeared on at least three independent occasions in its history. In this study, we applied methods to understand the evolution of obligate parasitism, as well as the genes involved in three different feeding habits. First, we inferred the ancestral habit of Calliphoridae. Stochastic character mapping along the phylogeny of the family indicated that the most likely ancestral habit was necro-saprophagy and that obligate parasitism arose later (independently of facultative parasitism). Two-choice assays with females allowed the precise classification of the species by feeding habit. Assays with larvae showed that both necro-saprophagous species and facultative parasites feed on decaying as well as fresh meat. In contrast, the obligate parasite Co. hominivorax showed aversive behavior toward decaying meat and developed only in fresh meat. These results led to the hypothesis that parasitism arose from a shift in the attraction of females to oviposition substrates, followed by the specialization of the parasitic larva. The search for genes involved in the different feeding habits was performed through differential gene expression analysis of RNA-seq data generated for six calliphorids. Within a dataset containing more than 1000 candidate genes, 230 potential candidate genes were found for future research. In addition, the general pattern observed indicated that variation in both regulatory and coding regions is predominantly under purifying selection.
|
3 |
Identificação e anotação funcional de novos transcritos com expressão alterada no câncer pancreático / Identification and functional annotation of novel transcripts with altered expression in pancreatic cancer
Sosa, Omar Julio. 27 February 2019
In this work, we implemented a bioinformatics pipeline to process and analyze total, strand-specific RNA-seq data generated in our laboratory from matched samples of tumor and adjacent non-tumor pancreatic tissue from 14 patients, with the goal of generating a high-resolution catalog of the composition of, and alterations in, the transcriptome of pancreatic ductal adenocarcinoma (PDAC), including protein-coding and non-coding genes.
|
4 |
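Identificação de Alterações na Expressão de Pseudogenes e seus Genes Parentais correspondentes em Adenocarcinoma Pulmonar / Identification of Alterations in the Expression of Pseudogenes and Their Corresponding Parental Genes in Lung Adenocarcinoma
Lapa, Rainer M.L. January 2019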
Advisor: Patricia Pintor dos Reis / Abstract: Background: Lung adenocarcinoma is the most common histological subtype of lung cancer and is associated with a high number of patient deaths (>1 million) every year. Clinically useful biomarkers have been identified; among these, non-coding RNAs and pseudogenes have a potent role in the regulation of miRNA target genes and parental genes, respectively, which are associated with tumorigenesis pathways. Objectives: To identify alterations in pseudogene expression in lung adenocarcinoma using transcriptome data (RNA-Seq). Material and Methods: This study included 27 lung adenocarcinoma tumors and 10 histologically normal lung tissues adjacent to the tumors, from the same patients. RNA-Seq data were generated on the Illumina HiScan SQ platform and used in a data analysis strategy (pipeline) to identify pseudogene sequences with abnormal expression in tumors compared with normal tissues. Pseudogenes with significantly altered expression (p<0.05) were validated using an external dataset, The Cancer Genome Atlas (TCGA), and subsequently used for in silico functional analysis with computational tools including IID (Integrated Interactions Database), TOPPGENES (functional enrichment analysis), mirDip (microRNA Data Integration Portal) and NAViGaTOR (Network Analysis, Visualization, & Graphing TORonto). Results and Discussion: A total of 60 deregulated pseudogenes were identified in lung adenocarcinoma, 34 of which were validated in the TCGA database. Some pseudogenes sho... / Doctorate
|
5 |
Análise transcriptômica da interação mamoeiro-Papaya Meleira Virus / Transcriptomic analysis of the papaya-Papaya meleira virus interaction
Madroñero, Leidy Johana. 27 November 2014
CAPES / Papaya (Carica papaya L.) is one of the most widely cultivated fruit crops in tropical and subtropical regions of the world. Brazil is among the countries that produce and export the most papaya. Espírito Santo and Bahia account for more than 70% of Brazil's papaya-growing area. However, diseases caused by infectious microorganisms considerably affect production. Among the main diseases is papaya sticky disease, caused by Papaya meleira virus (PMeV), for which no resistant cultivar is yet available. Interestingly, symptoms are triggered only after fruiting. The molecular mechanisms involved in symptom development and in the plant's defense response to PMeV remain unclear. To understand the key points of this interaction and so enable the development of genetic improvement methods, we conducted a transcriptomic study. RNA-seq technology was used to sequence the transcriptome of plants 3, 6 and 8 months after planting, inoculated and not inoculated with PMeV. Genes differentially expressed across the three time points and the two conditions were identified and analyzed. These analyses revealed a general expression pattern for the genes involved in this interaction. We found 21 genes whose expression profile was altered in inoculated plants exclusively at six months of age. Of these, 8 genes involved in defense response and cell death, stress response, and response to biotic and abiotic stimuli were down-regulated, whereas the remaining 13 genes, involved mainly in primary metabolic processes, biogenesis, cell differentiation, cell cycle, cell communication and cell growth, as well as reproduction and flower development, were up-regulated. These results suggest that at six months of age the plant is forced to alter its gene expression program, directing its response toward the developmental processes required at this physiological stage, which take precedence over the stress response, and that this ultimately leads to the development of symptoms.
|
6 |
Differences in global gene expression in muscle tissue of Nellore cattle with divergent meat tenderness
Fonseca, Larissa Fernanda Simielli; Gimenez, Daniele Fernanda Jovino; dos Santos Silva, Danielly Beraldo; Barthelson, Roger; Baldi, Fernando; Ferro, Jesus Aparecido; Albuquerque, Lucia Galvão. 04 December 2017
Background: Meat tenderness is the sensory attribute consumers value most. This trait is affected by a number of factors, including genotype, age, animal sex, and pre- and post-slaughter management. In view of the high percentage of Zebu genes in the Brazilian cattle population, mainly Nellore cattle, the improvement of meat tenderness is important, since an increasing proportion of Zebu genes in the population reduces meat tenderness. However, this trait is difficult to measure because it can only be assessed after slaughter. New technologies such as RNA-Seq have been used to increase our understanding of the genetic processes regulating quantitative trait phenotypes. The objective of this study was to identify differentially expressed genes related to meat tenderness in Nellore cattle, in order to elucidate the genetic factors associated with meat quality. Samples were collected 24 h postmortem and the meat was not aged. Results: We found 40 differentially expressed genes related to meat tenderness, 17 with known functions. Fourteen genes were up-regulated and 3 were down-regulated in the tender meat group. Genes related to ubiquitin metabolism, transport of molecules such as calcium and oxygen, acid-base balance, collagen production, actin, myosin, and fat were identified. The PCP4L1 (Purkinje cell protein 4 like 1) and BoLA-DQB (major histocompatibility complex, class II, DQ beta) genes were validated by qRT-PCR. The results showed relative expression values similar to those obtained by RNA-Seq, with the same direction of expression (i.e., the two techniques revealed higher expression of PCP4L1 in tender meat samples and of BoLA-DQB in tough meat samples). Conclusions: This study revealed the differential expression of genes and functions in Nellore cattle muscle tissue, which may contain potential biomarkers involved in meat tenderness.
|
7 |
Identification of Candidate Resistance Genes in Multiple Herbicide Resistant Echinochloa Colona
Wright, Alice Ann. 06 May 2017
Herbicide resistance is increasing in incidence among weed populations and poses a threat to food security. In Sunflower County, MS, a population of junglerice was identified with resistance to four herbicides, fenoxaprop-P-ethyl, imazamox, quinclorac, and propanil, each representing a different mechanism of action. The target site of fenoxaprop-P-ethyl, acetyl coenzyme A carboxylase (ACCase), was investigated. The ACCase contained none of the known resistance-conferring point mutations and an enzyme assay revealed no difference in response to increasing levels of fenoxaprop-P-ethyl between the resistant biotype and a sensitive biotype, indicating that the ACCase enzyme in the resistant biotype was sensitive to the herbicide. Whole-plant dose response assays in the presence and absence of cytochrome P450 and glutathione-S-transferase (GST) inhibitors did not increase efficacy of fenoxaprop-P-ethyl in the resistant biotype. However, when malathion, a cytochrome P450 inhibitor, was applied with imazamox or quinclorac, a reduction in resistance was observed in the resistant biotype, suggesting that a cytochrome P450 was important to the resistance mechanism for these two herbicides. RNA was isolated from the resistant and sensitive biotypes before and one hour after imazamox treatment for RNA-seq analysis. The reads from all samples were pooled to assemble the first E. colona leaf transcriptome. Differential gene expression analysis comparing untreated and treated samples for both biotypes revealed that several stress response genes were upregulated following herbicide exposure. A time course examining six of these genes showed that expression peaked between 4 and 12 hours and then dropped to untreated levels by 48 hours. Comparison of untreated resistant and sensitive plants revealed that a kinase and GST were significantly upregulated in the resistant biotype and an F-box protein was significantly downregulated. 
SNP analysis of cytochrome P450 sequences identified several nonsynonymous point mutations of interest including two transcripts that had premature stop codons in the sensitive but not the resistant biotype. These transcripts and their products should be the subject of future studies to determine if and how they are involved in resistance.
|
8 |
A QUASI-LIKELIHOOD METHOD TO DETECT DIFFERENTIALLY EXPRESSED GENES IN RNA-SEQUENCE DATA
Gu, Chu-Shu. January 2016
In recent years, the RNA-sequencing (RNA-seq) method, which measures the transcriptome by counting short sequencing reads obtained by high-throughput sequencing, is replacing the microarray technology as the major platform in gene expression studies. The large amount of discrete data in RNA-seq experiments calls for effective analysis methods. In this dissertation, a new method to detect differentially expressed genes based on quasi-likelihood theory is developed in experiments with a completely randomized design with two experimental conditions. The proposed method estimates the variance function empirically and consequently it has similar sensitivities and FDRs across distributions with different variance functions. In a simulation study, the method is shown to have similar sensitivities and FDRs across the data with three different types of variance functions compared with some other popular methods. This method is applied to a real dataset with two experimental conditions along with some competing methods.
The new method is then extended to more complex designs such as an experiment with multiple experimental conditions, an experiment with block design and an experiment with factorial design. The same advantages for the new method have been found in simulation studies. This method and some competing methods are applied to three real datasets with complex designs.
The new method is also applied to analyze reads per kilobase per million mapped reads (RPKM) data. In the simulation, the method is compared with the Linear Models for Microarray Data (LIMMA) originally developed for microarray analysis (Smyth, 2004) and the question of normalization is also examined. It is shown that the new method and the LIMMA method have similar performance. Further normalization is required for the proper analysis of the RPKM data and the best such normalization is the scaling method. Analyzing raw count data properly has better performance than analyzing the RPKM data. Different normalization and statistical methods are applied to a real dataset with varied gene length across samples. / Thesis / Doctor of Philosophy (PhD)
|
9 |
CADMIUM EXPOSURE ALTERS GENE EXPRESSION OF LENS, RETINA, AND EYE-RELATED GENES IN ZEBRAFISH AND HUMAN LENS EPITHELIAL CELLS
Srinivasan, Krishna. January 2018
Vision is a crucial aspect of life for humans and animals. Impaired vision can lead to reduced quality of life along with other complications. Cataracts are a leading cause of impaired vision and blindness worldwide. Cataracts develop as a process of aging, although several environmental and lifestyle factors increase the risk of this disease. The toxic metal cadmium (Cd) has been associated with cataract formation and other ocular diseases such as macular degeneration. Cadmium exposure experiments were conducted to investigate potential pathways or mechanisms by which Cd may contribute to cataract formation and ocular disease. Zebrafish larvae (72, 96, and 120 hours post fertilization), adult zebrafish (6-month male, 10-month male, and 10-month female) and the B3 human lens epithelial (HLE) cell line were acutely exposed to varying concentrations of Cd. Transcriptomic changes relative to control (0 μM Cd) were determined using microarray analysis for zebrafish larvae and RNA sequencing (RNA-Seq) for adult zebrafish and HLE cells. Gene Ontology (GO) enrichment analysis for the zebrafish larvae exposure (50 μM Cd for 4 or 8 hours) enriched the "retina development in camera-type eye" term, and genes involved in enrichment (dnmt1, ccna2, fen1, mcm3 and slbp) were down-regulated. Gene set enrichment analysis (GSEA) for the 10-month male zebrafish exposure (50 μM Cd for 4 hours) enriched the "embryonic eye morphogenesis" gene set and significant genes involved in enrichment (tcf7l1a, pitx2, fzd8a, sfrp5, lmx1bb, mfap2, six3b, lum, phactr4b, and foxc1a) were down-regulated. GSEA for the 10-month female zebrafish (50 μM Cd for 4 hours) enriched the "photoreceptor cell differentiation" gene set and significant genes involved in enrichment (odc1, thrb, and ush2a) were up-regulated. GO enrichment analysis for up-regulated genes in the HLE cell exposure (10 μM Cd for 4 hours) enriched the terms "eye development" (22 genes) and "lens development in camera-type eye" (CITED2, SKIL, CRYAB, SLC7A11, TGFB2, EPHA2, BCAR3, WNT5B, and BMP4). These results show cadmium is capable of altering transcription of eye-related genes in both zebrafish and human models, which may contribute to the formation of ocular disease. Many of these genes are involved in lens and retina development, yet they are also associated with diseases in these eye structures. Future studies could assess the consequences of altered transcription of these genes, which could help elucidate the mechanisms of these changes and the overall effect of cadmium exposure on ocular disease. Ultimately, our study characterized the regulation of eye-related genes in response to Cd exposure and provided valuable knowledge laying the foundation for identification of the molecular mechanisms contributing to ocular diseases. / Thesis / Master of Science (MSc) / The eye is a sphere-like organ which is important for visualizing your surroundings. It is composed of many different structures such as the cornea, lens and retina. Many eye diseases have been characterized by abnormalities in eye structures; for example, a cataract occurs when the lens becomes cloudy and unable to focus light, while macular degeneration is defined by progressive deterioration of the retinal macula region. While these diseases can occur through the natural aging process, certain environmental factors can increase risk. Exposure to cadmium, a toxic heavy metal which causes negative effects in animals, has been shown to be associated with eye diseases such as cataracts and macular degeneration. In order to expand on this knowledge, we exposed both zebrafish and human lens cells to cadmium. By utilizing different experimental methods such as microarray analysis and RNA sequencing, eye-related genes which were affected by cadmium were revealed. Identifying the relationship between eye diseases, cadmium and gene expression will help identify the mechanism by which cadmium contributes to eye disease formation.
|
10 |
Characterization of 16S rRNA 3’ Termini Using RNA-Seq Data
Silke, Jordan. 08 April 2019
Optimizing the production of useful macromolecules from transgenic microorganisms is crucial to biopharmaceutical companies. Improving bacterial growth and replication depends largely on the efficiency of translation, which is rate-limited by initiation. Among the most important interactions between the mRNA translation initiation region (TIR) and the translation machinery is the association between the Shine-Dalgarno (SD) sequence in the TIR and the complementary anti-SD (aSD) sequence which is located within a short unstructured segment that includes the 3’ terminus (3’ TAIL) of the mature 16S rRNA. However, the mature 3’ TAIL has been poorly characterized in the majority of bacteria, rendering optimal SD/aSD pairing unclear in these species.
In light of this, we established a novel strategy to characterize the mature 3’ TAILs of bacterial species that leverages the availability of publicly stored RNA sequencing (RNA-Seq) data. In chapter 2, we devised an RNA-Seq-based approach to successfully recover the experimentally verified 3’ TAIL in E. coli (5’-CCUCCUUA-3’) and resolve inconsistencies surrounding the identity of the 3’ TAIL in Bacillus subtilis. In chapter 3, we improve the method introduced in chapter 2 to define the 3’ TAIL termini more clearly and reliably for 13 bacterial species with available protein abundance data.
Our results reveal considerable heterogeneity in the termini of 3’ TAILs among closely related species and show that sites downstream of the canonical CCUCC aSD motif are more important to initiation than previously believed. My research advances our understanding of microbial translation efficiency in two significant ways: 1) providing an RNA-Seq-based approach to characterize rRNA transcripts, and 2) elucidating optimal recognition between protein-coding genes and the rRNA translation machinery.
|