1 |
Further development of the visual genome explorer: a visual genomic comparative tool
鄭啓航, Cheng, Kai-hong. January 2001
published_or_final_version / Zoology / Master / Master of Philosophy
|
2 |
Heuristic and self-training methods for improving gene prediction in prokaryotes
Besemer, John David 12 1900
No description available.
|
3 |
RNA secondary structure prediction and an expert systems methodology for RNA comparative analysis in the genomic era
Doshi, Kishore John, 1974- 28 August 2008
The ability of certain RNAs to fold into complicated secondary and tertiary structures allows them to perform a variety of functions in the cell. Since these structures are central to understanding how the RNAs function, one of the most active areas of research has been how to accurately and reliably predict RNA secondary structure from sequence, better known as the RNA folding problem. This dissertation examines two fundamental areas of research in RNA structure prediction: free energy minimization and comparative analysis. The most popular RNA secondary structure prediction program, Mfold 3.1, predicts RNA secondary structure via free energy minimization using experimentally determined energy parameters. I present an evaluation of the accuracy of Mfold 3.1 using the largest set of phylogenetically diverse, comparatively predicted RNA secondary structures available. This evaluation shows that, despite significant revisions to the energy parameters, the prediction accuracy of Mfold 3.1 is not significantly improved when compared to previous versions. In contrast, RNA comparative analysis has repeatedly demonstrated the ability to accurately and reliably predict RNA secondary structure. The downside is that RNA comparative analysis frequently requires an expert systems methodology which is predominantly manual in nature. As a result, RNA comparative analysis does not scale adequately to be useful in the genomic era. Therefore, I developed the Comparative Analysis Toolkit (CAT), which is intended to be the fundamental component of a vertically integrated software infrastructure to facilitate high-throughput RNA comparative analysis using an expert systems methodology.
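To give a feel for what dynamic-programming structure prediction involves, here is a minimal sketch of the classic Nussinov base-pair maximization recurrence. It is a deliberately simplified stand-in, not the free energy minimization that Mfold 3.1 performs and not code from the dissertation: it counts pairs instead of summing experimentally determined energies. The function names, the allowed-pair set, and the `min_loop` default are illustrative assumptions.

```python
# Simplified illustration only: maximizes the number of nested base pairs
# (Nussinov-style). Real free energy minimization uses loop and stacking
# energies instead of a flat score of 1 per pair.

# Assumed pairing rules: Watson-Crick pairs plus the G-U wobble pair.
ALLOWED_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(a, b):
    """Return True if the two bases may pair under the assumed rules."""
    return (a, b) in ALLOWED_PAIRS


def nussinov_max_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in seq.

    min_loop is the minimum number of unpaired bases enclosed by a pair
    (a hairpin loop shorter than this is not allowed).
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] holds the best pair count for the subsequence seq[i..j].
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i + 1][j]  # case 1: base i stays unpaired
            # case 2: base i pairs with some base k, splitting the interval
            for k in range(i + min_loop + 1, j + 1):
                if can_pair(seq[i], seq[k]):
                    inside = dp[i + 1][k - 1]
                    outside = dp[k + 1][j] if k + 1 <= j else 0
                    best = max(best, 1 + inside + outside)
            dp[i][j] = best
    return dp[0][n - 1]
```

Calling `nussinov_max_pairs("GGGAAAUCCC")` scores a small hairpin whose three G-C pairs close a four-base loop. Comparative analysis, by contrast, infers pairs from covariation across aligned homologous sequences and has no such single recurrence.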
|
4 |
On multiple sequence alignment
Wang, Shu, 1973- 29 August 2008
The tremendous increase in biological sequence data presents us with an opportunity to understand the molecular and cellular basis for cellular life. Comparative studies of these sequences have the potential, when applied with sufficient rigor, to decipher the structure, function, and evolution of cellular components. The accuracy and detail of these studies are directly proportional to the quality of the sequence alignments. Given the large number of sequences per family of interest, and the increasing number of families to study, improving the speed, accuracy and scalability of multiple sequence alignment (MSA) is becoming an increasingly important task. In the past, much of the interest has been in global MSA. In recent years, the focus has shifted from global MSA to local MSA, which is needed to align variable sequences from different families or species. In this dissertation, we developed two new algorithms for fast and scalable local MSA: a three-way-consistency-based MSA and a biclustering-based MSA. The first algorithm is a three-way Consistency-Based MSA (CBMSA). CBMSA applies alignment consistency heuristics in the form of a new three-way alignment to MSA. While the three-way consistency approach maintains the same time complexity as the traditional pairwise consistency approach, it provides more reliable consistency information and better alignment quality. We quantify the benefit of using three-way consistency as compared to pairwise consistency. We have also compared CBMSA to a suite of leading MSA programs, and CBMSA consistently performs favorably. The second algorithm is a biclustering-based MSA. Biclustering is a clustering method that simultaneously clusters both the domain and range of a relation. A challenge in MSA is that the alignment of sequences is often intended to reveal groups of conserved functional subsequences.
Simultaneously, the grouping of the sequences can impact the alignment; this is precisely the kind of dual situation biclustering algorithms are intended to address. We define a representation of the MSA problem that enables the application of biclustering algorithms. We develop a computer program for local MSA, BlockMSA, that combines biclustering with divide-and-conquer. BlockMSA simultaneously finds groups of similar sequences and locally aligns subsequences within them. Further alignment is accomplished by dividing both the set of sequences and their contents. The net result is both a multiple sequence alignment and a hierarchical clustering of the sequences. BlockMSA was compared with a suite of leading MSA programs. With respect to quantitative measures of MSA, BlockMSA scores comparable to or better than the other leading MSA programs. With respect to biological validation of MSA, the other leading MSA programs lag behind BlockMSA in their ability to identify the most highly conserved regions.
|
5 |
Development of NASBA-primer search software for designing forensic saliva tandem repeat markers for mucin and amylase / Development of nucleic acid sequence based amplification primer search software for designing forensic saliva tandem repeat markers for mucin and amylase
Ara, Andleeb. January 2009
Nucleic Acid Sequence Based Amplification (NASBA) is a powerful in vitro technique for amplification. NASBA is routinely used in many fields of microbiology, including food microbiology, and most recently in the identification of forensic body secretions (saliva, tears, sweat, semen, vaginal secretions). NASBA has many advantages over the traditional Reverse Transcriptase Polymerase Chain Reaction (RT-PCR), including speed, high sample throughput and increased sensitivity. Proper selection of the sequence of interest and the design of NASBA primers precisely for that sequence are the two most critical steps for any NASBA assay. Proper design of NASBA primers includes important considerations such as product (amplicon) length, the addition of a T7 RNA polymerase promoter sequence at the 5' end of one primer, and a 3' AT-rich region.
Primer design is, therefore, laborious and error-prone. Currently, no software is available that facilitates primer design for NASBA. In this study, Java-based software for designing NASBA primers was developed, which will enable rapid and specific NASBA primer design for gene expression studies. The program was written with a minimal number of lines of Java code to reduce run time and storage space. Two codes were scripted for this software, a pseudo-code (for Java program developers) and byte-code (for the Windows operating system). Our results showed that Java is an efficient tool for searching sequences of interest within a gene (or mRNA), allowing NASBA primers to be designed more quickly and effortlessly. The program has maximum memory storage capacity and allows users to retrieve old data with reference to date or time. To test the practicality of the newly developed program, gene sequences of salivary mucin and amylase were examined to facilitate extraction of novel RNA tandem repeat element NASBA markers for human saliva forensic identification. Tandem repeats in the non-coding portion of the human genome are highly polymorphic and considered best for forensic use. Currently, only 13 Short Tandem Repeat (STR) markers are available for forensic casework, which are not enough to establish a definitive link between the victim and suspects. Identification and validation of new human body fluid tandem repeat markers is crucial for achieving high-throughput results and for exonerating the innocent. / Access to thesis permanently restricted to Ball State community only / Department of Biology
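The primer considerations listed in the abstract (amplicon length, a T7 promoter on the 5' end of one primer, AT content near the 3' end) can be illustrated with a short sketch. This is written in Python for brevity and is not the thesis's Java software. The promoter string is a commonly cited T7 promoter sequence used here as an assumption, and every function name, parameter, and default length is hypothetical.

```python
# Illustrative sketch only; not the thesis software. Assay-specific rules
# (melting temperature, secondary structure, specificity) are not modeled.

# Assumption: a commonly cited T7 RNA polymerase promoter sequence.
T7_PROMOTER = "TAATACGACTCACTATAGGG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def at_fraction(seq):
    """Fraction of A and T bases in seq, usable as an AT-richness check."""
    seq = seq.upper()
    return (seq.count("A") + seq.count("T")) / len(seq)


def design_nasba_primers(target, start, amplicon_len, primer_len=20):
    """Return (forward_primer, promoter_primer) for one candidate amplicon.

    target       -- sense-strand sequence of the gene or mRNA (as DNA letters)
    start        -- 0-based start of the amplicon within target
    amplicon_len -- desired product length
    primer_len   -- length of the hybridizing part of each primer (assumed)
    """
    target = target.upper()
    end = start + amplicon_len
    if start < 0 or end > len(target) or amplicon_len < 2 * primer_len:
        raise ValueError("amplicon does not fit the target or is too short")
    # Sense primer copies the first primer_len bases of the amplicon.
    forward = target[start:start + primer_len]
    # Antisense primer hybridizes to the last primer_len bases of the
    # amplicon and carries the T7 promoter on its 5' end.
    hybridizing = reverse_complement(target[end - primer_len:end])
    return forward, T7_PROMOTER + hybridizing
```

A search tool along these lines would slide `start` across the gene, call `design_nasba_primers` for each candidate amplicon, and keep the pairs whose `at_fraction` over the relevant 3' bases passes a chosen threshold.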
|
6 |
Computational Methods for Inferring Mechanisms of Biological Heterogeneity in Single-Cell Data
Persad, Sitara Camini January 2024
Single-cell sequencing techniques, such as single-cell RNA sequencing (scRNA-seq) and single-cell ATAC sequencing (scATAC-seq), have revolutionized our understanding of cellular diversity and function. Genetic and epigenetic factors influence phenotypic heterogeneity in ways that are just beginning to be understood. In this work, we develop methods for inferring mechanisms of biological heterogeneity in single-cell data, with particular applications to cancer biology.
First, we develop a kernel archetype analysis method for overcoming noise and sparsity in single-cell data by aggregating single cells into high-resolution cell states. We show that the proposed approach captures robust and biologically meaningful cell states and enables the inference of epigenetic regulation of phenotypic heterogeneity. In the second part of this thesis, we develop methods for linking genotypic and phenotypic information, first by using aggregated single-cell RNA sequencing and a hidden Markov model to infer copy number variation. We demonstrate that aggregation improves copy number inference over existing approaches.
We then integrate DNA sequencing with single-cell RNA sequencing to infer copy number profiles in a rapid autopsy of a patient with metastatic pancreatic cancer. We develop a scalable algorithm for inferring phylogenetic relationships between cells from noisy copy number profiles. We show that our approach more accurately recovers phylogenetic relationships between cells and apply it to understand the relationship between genotype and phenotype in metastatic cancer. Finally, we develop a metric for quantifying the extent to which genotype determines phenotype in lineage tracing data. We show that it more accurately quantifies phenotypic plasticity compared to existing approaches. Altogether, these methods can be used to help uncover the mechanisms underlying phenotypic heterogeneity in biological systems.
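The abstract mentions using a hidden Markov model to infer copy number variation from aggregated expression. As a rough sketch of how such decoding can work, the following Viterbi example assigns each genomic bin to a loss, neutral, or gain state from a log-ratio signal. It is not the thesis's model: the three states, the Gaussian emission means and spread, and the self-transition probability are all assumed values chosen for illustration.

```python
import math

# Illustrative three-state copy number HMM; every parameter is an assumption.
STATES = ("loss", "neutral", "gain")
MEANS = {"loss": -0.5, "neutral": 0.0, "gain": 0.5}  # assumed log-ratio means
SIGMA = 0.25  # assumed emission standard deviation
STAY = 0.98   # assumed probability of keeping the same state in the next bin


def log_gauss(x, mu, sigma):
    """Log density of a normal distribution with mean mu and std sigma."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))


def viterbi(obs):
    """Most likely state sequence for a list of per-bin log-ratios."""
    if not obs:
        return []
    log_stay = math.log(STAY)
    log_move = math.log((1 - STAY) / (len(STATES) - 1))
    # Uniform prior over the initial state.
    score = {s: -math.log(len(STATES)) + log_gauss(obs[0], MEANS[s], SIGMA)
             for s in STATES}
    backpointers = []
    for x in obs[1:]:
        new_score, ptr = {}, {}
        for s in STATES:
            # Best predecessor for state s, including the transition cost.
            best_prev = max(
                STATES,
                key=lambda p: score[p] + (log_stay if p == s else log_move),
            )
            trans = log_stay if best_prev == s else log_move
            new_score[s] = score[best_prev] + trans + log_gauss(x, MEANS[s], SIGMA)
            ptr[s] = best_prev
        score = new_score
        backpointers.append(ptr)
    # Trace back from the best final state.
    state = max(STATES, key=score.get)
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    return path[::-1]
```

With these assumed parameters, a single outlying bin is absorbed into its neighbors' state, while a sustained shift across many bins is called as a separate segment. Averaging expression over groups of cells before decoding, as the abstract describes, would reduce the noise in `obs`.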
|