1 |
Computational Mining and Survey of Simple Sequence Repeats (SSRs) in Expressed Sequence Tags (ESTs) of Dicotyledonous Plants
Kumpatla, Siva Prasad 07 1900
Submitted to the faculty of the School of Informatics in partial fulfillment of the requirements for the degree Master of Science in Bioinformatics in the School of Informatics, Indiana University, July 2004 / DNA markers have revolutionized the field of genetics by increasing the pace of genetic analysis. Simple sequence repeats (SSRs) are repetitions of nucleotide motifs of 1 to 5 bases and are currently the markers of choice in many plant and animal genomes because of their abundant distribution in the genomes, hypervariable nature and suitability for high-throughput analysis. While SSRs, once developed, are extremely valuable, their development is time-consuming, laborious and expensive. Sequences from many genomes are continuously made freely available in public databases, and mining these sources with computational approaches permits rapid and economical marker development. Expressed sequence tags (ESTs) are ideal candidates for mining SSRs, not only because they are available in large numbers but also because they represent expressed genes. Large-scale SSR mining efforts in plants to date have focused on monocotyledonous plants. In this project, an efficient SSR identification tool was developed and used to mine SSRs from more than 53 dicotyledonous species. A total of 92,648 non-redundant ESTs, or 6.0% of the 1.54 million dicotyledonous ESTs investigated in this study, were found to contain SSRs. The frequency of non-redundant ESTs containing SSRs among the species investigated ranged from 2.65% to 16.82%. More than 80% of the non-redundant ESTs having SSRs contained a single SSR, while the others contained two or more SSRs. An extensive analysis of the occurrence and frequencies of various SSR types revealed that the A/T mononucleotide, AG/GA/CT/TC dinucleotide, AAG/AGA/GAA/CTT/TTC/TCT trinucleotide, and TTTA and TTAA tetranucleotide repeats are the most abundant in dicotyledonous species. In addition, an analysis of the number of repeats across species revealed that the majority of the mononucleotide SSRs contained 15-25 repeats, while the majority of the di- and trinucleotide SSRs contained 5-10 repeats. By providing valuable information on the abundance of SSRs in ESTs of a large number of dicotyledonous species, this study demonstrates the potential of the computational mining approach for rapid discovery of SSRs towards the development of markers for genetic analysis and related applications.
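
The abstract does not describe how the SSR identification tool works internally. As an illustration only, the following minimal Python sketch scans a sequence for perfect repeats of motifs 1 to 5 bases long. The function names and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen for the example, not the thresholds or the implementation used in the thesis.

```python
# Assumed minimum number of repeats per motif length (illustrative values only).
MIN_REPEATS = {1: 15, 2: 5, 3: 5, 4: 4, 5: 4}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit (e.g. 'ATAT')."""
    k = len(motif)
    return not any(
        k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k)
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k, min_n in sorted(min_repeats.items()):
        i = 0
        while i + k <= len(seq):
            motif = seq[i:i + k]
            # Skip ambiguous bases and motifs already covered by a shorter unit.
            if set(motif) - set("ACGT") or not is_primitive(motif):
                i += 1
                continue
            # Count how many times the motif repeats back to back from position i.
            n = 1
            while seq[i + n * k:i + (n + 1) * k] == motif:
                n += 1
            if n >= min_n:
                hits.append((i, motif, n))
                i += n * k  # jump past the repeat so it is reported once
            else:
                i += 1
    return sorted(hits)


if __name__ == "__main__":
    est = "CCGT" + "AG" * 6 + "TTC" + "A" * 16 + "G"
    for start, motif, count in find_ssrs(est):
        print(start, motif, count)
```

Applied to each EST in a collection, a scan like this would give the per-species counts of SSR-containing sequences and motif types that the abstract summarizes. Handling of imperfect or compound repeats is left out of the sketch.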
|
2 |
Computational mining for terminator-like genes in soybean
Mahmood, Hamida January 1900
Master of Science / Genetics Interdepartmental Program - Plant Pathology / Frank F. White / Sanzhen Liu / Plants and bacterial pathogens are in constant co-evolution to survive and sustain the next generation. Plants have two well-characterized levels of active defense: pathogen-associated molecular pattern (PAMP)-triggered immunity (PTI) and effector-triggered immunity (ETI). Some plants that are hosts for bacterial pathogens employing type three secretion system transcription activator-like (TAL) effectors have evolved a unique form of ETI, namely TAL effector-mediated ETI. TAL effectors induce expression of specific disease susceptibility (S) genes. Rice and pepper have evolved resistance genes termed terminator (T) genes, which have promoters that bind TAL effectors and, upon expression of the T gene, elicit a hypersensitive reaction (HR) and cell death. Only five T genes have been cloned, and the origin of most T genes is unknown. To determine the presence of candidate T genes in other plant species, a bioinformatics-based mining approach was designed. The basic approach utilized three structural features common to four terminator genes: a short transmembrane domain, a secretion signal domain, and a length of <200 amino acid residues. Soybean was chosen as the test plant species, and 161 genes that fulfilled the three parameters were retrieved using R and Perl programs. Further, functional annotation of the candidate genes was conducted by comparison with genes in public databases. Major classes of proteins found included unique and hypothetical, defense/stress/oxidative stress associated, DNA-binding, kinases, transferases, hydrolases, effector-related tRNA splicing, and F-box domain proteins. The potential T genes will serve as candidates for experimental validation and as new resources for durable resistance strategies in crop species.
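
The three-feature screen described above can be pictured as a simple filter over annotated proteins. The sketch below is written in Python for illustration, whereas the thesis reports using R and Perl. It assumes that signal-peptide and transmembrane predictions have already been computed by some external tool. The `Protein` record, its field names, and the gene identifiers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Protein:
    gene_id: str
    length: int               # protein length in amino acid residues
    has_signal_peptide: bool  # assumed to come from an external predictor
    tm_domains: int           # number of predicted transmembrane domains


def is_candidate(p, max_length=200):
    """Apply the three structural criteria: short, secreted, membrane-spanning."""
    return p.length < max_length and p.has_signal_peptide and p.tm_domains >= 1


def filter_candidates(proteins, max_length=200):
    """Return the gene IDs of proteins that meet all three criteria."""
    return [p.gene_id for p in proteins if is_candidate(p, max_length)]


if __name__ == "__main__":
    # Hypothetical records for illustration; not data from the thesis.
    proteins = [
        Protein("gene_A", 150, True, 1),
        Protein("gene_B", 320, True, 1),   # too long
        Protein("gene_C", 120, False, 2),  # no secretion signal
        Protein("gene_D", 199, True, 0),   # no transmembrane domain
    ]
    print(filter_candidates(proteins))
```

The abstract specifies a "short" transmembrane domain, which this sketch does not model. A fuller version would also need the predicted length of each domain.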
|