141 |
Analyses of ribosomal DNA internal transcribed spacer sequences from Juglans nigra and leaf-associated fungi in Zoar Valley, NY / Ragozine, Vincent. January 2008
Thesis (M.S.)--Youngstown State University, 2008. / Includes bibliographical references (leaves 37-39). Also available via the World Wide Web in PDF format.
|
142 |
The study and comparison of maize centromeric sequences / Page, Brent. January 2000
Thesis (Ph. D.)--University of Missouri-Columbia, 2000. / Typescript. Vita. Includes bibliographical references (leaves 168-176). Also available on the Internet.
|
144 |
Codon usage biases of influenza A viruses / Wong, Hoi-man, Emily. January 2009
Thesis (M. Phil.)--University of Hong Kong, 2010. / Includes bibliographical references (p. 178-187). Also available in print.
|
145 |
Data mining of post genome-wide association studies and next generation sequencing / 桂宏胜, Gui, Hongsheng. January 2013
abstract / Psychiatry / Doctoral / Doctor of Philosophy
|
146 |
Predicting functional impact of nonsynonymous mutations by quantifying conservation information and detect indels using split-read approach / Zeng, Shuai, 曾帥. January 2014
Rapidly developing sequencing technology has given scientists the opportunity to examine detailed genotype information in the human genome. Computational programs have played important roles in identifying disease-related genomic variants from huge amounts of sequencing data.
In recent years, a number of computational algorithms have been developed that solve many crucial problems in sequencing data analysis, such as mapping sequencing reads to the genome and identifying SNPs. However, many difficult and important issues still await satisfactory solutions. A key challenge is identifying disease-related mutations against the background of non-pathogenic polymorphisms. Another crucial problem is detecting INDELs, especially long deletions, under the technical limitations of second-generation sequencing technology.
To predict disease-related mutations, we developed a machine learning-based (random forests) prediction tool, EFIN (Evaluation of Functional Impact of Nonsynonymous mutations). We build a multiple sequence alignment (MSA) for a query protein and its homologous sequences. The MSA is then divided into blocks according to the taxonomic information of the sequences. After that, we quantify the conservation in each block using a number of selected features, for example entropy, a concept borrowed from information theory. EFIN was trained on the Swiss-Prot and HumDiv datasets. In a series of fair comparisons, EFIN showed better results than widely used algorithms in terms of AUC (area under the ROC curve), accuracy, specificity and sensitivity. A web-based database is available to users worldwide at paed.hku.hk/efin.
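The entropy feature mentioned above can be illustrated with a minimal sketch. It computes the Shannon entropy of each column in a block of aligned sequences, where lower entropy means stronger conservation. This is an illustration only, not EFIN's actual feature set: the function names, the gap handling, and the example block are all assumptions.

```python
import math
from collections import Counter


def column_entropy(column, ignore="-"):
    """Shannon entropy in bits of one alignment column (gap characters skipped)."""
    residues = [c for c in column if c not in ignore]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    # Sum of -p * log2(p) over the observed residues.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def block_entropies(block):
    """Per-column entropy for a block of equal-length aligned sequences."""
    return [column_entropy(col) for col in zip(*block)]


# Hypothetical block of three aligned sequences (not taken from the thesis).
example_block = ["MKV-", "MRVA", "MKIA"]
print(block_entropies(example_block))
```

A fully conserved column scores 0 bits, and a column split evenly between two residues scores 1 bit. A per-block summary such as the mean column entropy could then serve as one input feature to a classifier.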
To solve the second problem, we developed Linux-based software, SPLindel, that detects deletions (especially long deletions) and insertions using second-generation sequencing data. For each sample, SPLindel uses a split-read method to detect candidate INDELs, building alternative references to go along with the reference sequences. We then remap all the relevant reads using both the original references and the alternative-allele references. A Bayesian model integrating paired-end information is used to assign the reads to the most likely locations on either the original reference allele or the alternative allele. Finally, we count the reads that support the alternative allele (with insertions or deletions relative to the original reference allele) and those that support the original allele, and fit a beta-binomial mixture model. Based on this model, the likelihood of each INDEL is calculated and the genotype is predicted. SPLindel runs at about the same speed as GATK, but much faster than DINDEL. SPLindel obtained results very similar to those of GATK and DINDEL for INDELs of 1-15 bp, but is much more effective at detecting larger INDELs.
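The final genotyping step can be sketched as follows. Given counts of reads supporting the alternative and original alleles, the code scores three genotypes under beta-binomial distributions and picks the most likely one. This is a simplified illustration, not SPLindel's implementation: the shape parameters in `GENOTYPE_PARAMS` are assumed values chosen by hand rather than fitted from data as a mixture model would be, and all names are hypothetical.

```python
from math import lgamma, exp


def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_betabinom(k, n, a, b):
    """Log probability of k successes in n trials under a beta-binomial(a, b)."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_choose + log_beta(k + a, n - k + b) - log_beta(a, b)


# Assumed (not fitted) shape parameters per genotype: the expected fraction of
# alternative-allele reads is near 0, 0.5 and 1 respectively.
GENOTYPE_PARAMS = {"0/0": (1.0, 50.0), "0/1": (20.0, 20.0), "1/1": (50.0, 1.0)}


def call_genotype(alt_reads, ref_reads, params=GENOTYPE_PARAMS):
    """Return the most likely genotype and the log-likelihood of each genotype."""
    n = alt_reads + ref_reads
    scores = {g: log_betabinom(alt_reads, n, a, b) for g, (a, b) in params.items()}
    best = max(scores, key=scores.get)
    return best, scores


genotype, scores = call_genotype(alt_reads=14, ref_reads=16)
print(genotype, scores)
```

The beta-binomial allows more read-count variance than a plain binomial, which is one reason it is a common choice for allele counts from sequencing data.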
Using machine learning methods and statistical modeling techniques, we propose tools to solve these two important problems in sequencing data analysis. This study will help identify novel damaging nsSNPs more accurately and efficiently, and equip researchers with a more powerful tool for identifying INDELs, especially long deletions. As more and more sequencing data are generated, the methods and tools introduced in this thesis may help extract useful information to facilitate the identification of mutations that cause human diseases. / published_or_final_version / Paediatrics and Adolescent Medicine / Doctoral / Doctor of Philosophy
|
147 |
RNA secondary structure prediction and an expert systems methodology for RNA comparative analysis in the genomic era / Doshi, Kishore John, 1974- 28 August 2008
The ability of certain RNAs to fold into complicated secondary and tertiary structures enables them to perform a variety of functions in the cell. Since the secondary and tertiary structures formed by certain RNAs in the cell are central to understanding how they function, one of the most active areas of research has been how to accurately and reliably predict RNA secondary structure from sequence, better known as the RNA Folding Problem. This dissertation examines two fundamental areas of research in RNA structure prediction: free energy minimization and comparative analysis. The most popular RNA secondary structure prediction program, Mfold 3.1, predicts RNA secondary structure via free energy minimization using experimentally determined energy parameters. I present an evaluation of the accuracy of Mfold 3.1 using the largest set of phylogenetically diverse, comparatively predicted RNA secondary structures available. This evaluation will show that despite significant revisions to the energy parameters, the prediction accuracy of Mfold 3.1 is not significantly improved when compared to previous versions. In contrast, RNA comparative analysis has repeatedly demonstrated the ability to accurately and reliably predict RNA secondary structure. The downside is that RNA comparative analysis frequently requires an expert systems methodology which is predominantly manual in nature. As a result, RNA comparative analysis is not capable of scaling adequately to be useful in the genomic era. Therefore, I developed the Comparative Analysis Toolkit (CAT), which is intended to be the fundamental component of a vertically integrated software infrastructure to facilitate high-throughput RNA comparative analysis using an expert systems methodology.
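To give a feel for structure prediction by dynamic programming, here is a minimal sketch of Nussinov-style base-pair maximization. It is a deliberately simpler stand-in: Mfold minimizes free energy using experimentally determined energy parameters, whereas this sketch only counts the maximum number of nested base pairs. The function name, the allowed pair set, and the `min_loop` value are assumptions made for illustration.

```python
# Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs, with at least min_loop unpaired
    bases enclosed by every pair (a simple hairpin-loop constraint)."""
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] holds the optimum for the subsequence seq[i..j], inclusive.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: position j is unpaired
            for k in range(i, j - min_loop):  # case 2: j pairs with some k
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


print(max_base_pairs("GGGAAAUCC"))
```

An energy-based method replaces the "+1 per pair" score with stacking and loop energies, which is where revisions to the energy parameters would enter.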
|
148 |
Molecular characterization of a rare bacterial pathogen causing psoas abscess / 嚴德貞, Yim, Tak-ching. January 2003
published_or_final_version / Medical Sciences / Master / Master of Medical Sciences
|
149 |
Phylogenetic footprinting and modeling of the gp130 family of cytokines / Lo, Wing-sheung, James., 羅永裳. January 2004
published_or_final_version / Medical Sciences / Master / Master of Medical Sciences
|
150 |
Defining novel clinical syndromes and emerging pathogens / Woo, Chiu-yat, Patrick., 胡釗逸. January 2002
published_or_final_version / abstract / toc / Medicine / Master / Doctor of Medicine
|