Global ETD Search

11	Identification of bacteria with ambiguous biochemical profiles by 16S ribosomal RNA gene sequencing Chan, Yin-leung. January 2001 Thesis (M. Med. Sc.)--University of Hong Kong, 2001. / Includes bibliographical references (leaves 38-43). Bacteria Nucleotide sequence. Ribosomes.
12	Granulicatella, abiotrophia, and gemella bacteremia characterized by 16S ribosomal RNA gene sequencing / Chiu, Siu-kau. January 2002 Thesis (M. Med. Sc.)--University of Hong Kong, 2002. / Includes bibliographical references (leaves 16-19). Bacterial genetics. Nucleotide sequence.
13	Update and evaluation of 16SpathDB, an automated comprehensive database for identification of medically important bacteria by 16S rRNA gene sequencing Yeung, Shiu-yan, 楊兆恩 January 2013 Identification of pathogens is one of the important duties of the clinical microbiology laboratory. Traditionally, phenotypic tests are used to identify bacteria. However, because of the limitations of phenotypic tests, some bacteria cannot be identified at all and others cannot be identified promptly. 16S rRNA gene sequencing is a rapid and accurate alternative, and it is especially useful for identifying rare or slow-growing bacteria. Interpreting 16S rRNA gene sequencing results, however, is challenging for laboratory technicians and microbiologists. Alongside the well-known 16S rRNA gene databases such as GenBank, the Ribosomal Database Project (RDP-II), the MicroSeq databases, the Ribosomal Differentiation of Medical Microorganisms database (RIDOM) and SmartGene IDNS, 16SpathDB is an automated and comprehensive database for interpreting 16S rRNA gene sequencing results. The first version of 16SpathDB was established in 2011. In this study, 16SpathDB was updated to cover all clinically important bacteria listed in the 10th edition of the Manual of Clinical Microbiology (MCM) (Versalovic et al., 2011), producing a new version of the database, 16SpathDB 2.0. The database was evaluated using 689 16S rRNA gene sequences from 689 complete genomes of medically important bacteria. 
Among the 689 16S rRNA gene sequences, none was wrongly identified by 16SpathDB 2.0: 247 (35.8%) were reported as a single bacterial species with more than 98% nucleotide identity with the query sequence (category 1), 440 (63.9%) were reported as more than one bacterial species having more than 98% nucleotide identity with the query sequence (category 2), 2 (0.3%) were reported to the genus level (category 3), and none was reported as “no species in 16SpathDB 2.0 found to be sharing high nucleotide identity to your query sequence” (category 4). 16SpathDB 2.0 is an updated, automated, user-friendly, efficient and accurate database for 16S rRNA gene sequence interpretation in clinical microbiology laboratories. / published_or_final_version / Microbiology / Master / Master of Medical Sciences Pathogenic bacteria Nucleotide sequence
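As a rough illustration of the four reporting categories described in entry 13, the following Python sketch assigns a category from a table of percent nucleotide identities. It is not the 16SpathDB implementation: the function name `categorize`, the input format, and in particular the `genus_threshold` cutoff used for category 3 are assumptions, since the abstract states only the 98% species-level threshold.

```python
from typing import Dict


def categorize(identities: Dict[str, float],
               species_threshold: float = 98.0,
               genus_threshold: float = 96.0) -> int:
    """Return a reporting category (1-4) for one query sequence.

    identities maps a species name to its percent nucleotide identity
    with the query. genus_threshold is a hypothetical cutoff; the
    abstract does not say how genus-level reports are decided.
    """
    # Species whose identity exceeds the species-level threshold.
    species_hits = [name for name, pct in identities.items()
                    if pct > species_threshold]
    if len(species_hits) == 1:
        return 1  # a single species above the threshold
    if len(species_hits) > 1:
        return 2  # more than one species above the threshold
    if any(pct > genus_threshold for pct in identities.values()):
        return 3  # assumed rule: report to the genus level only
    return 4      # no species with high identity found
```

For example, a query matching one species at 99.2% and another at 95.0% falls into category 1 under these assumed thresholds.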
14	Efficient methods for improving the sensitivity and accuracy of RNA alignments and structure prediction Li, Yaoman, 李耀满 January 2013 RNA plays an important role in molecular biology, and RNA sequence comparison is an important method for analyzing gene expression. Because aligning RNA reads requires handling gaps, mutations, poly-A tails and similar features, it is much more difficult than aligning other sequences. In this thesis, we study RNA-Seq alignment tools and existing gene information databases, and how to improve alignment accuracy and predict RNA secondary structure. Known gene information databases contain a large amount of reliable gene information. We also note that most DNA alignment tools are well developed: they run much faster than existing RNA-Seq alignment tools and have higher sensitivity and accuracy. Combining these tools with a known gene information database, we present a method for aligning RNA-Seq data using DNA alignment tools; that is, we use the DNA alignment tools to perform the alignment and use the gene information to convert the result into genome-based alignments. Although gene information databases are updated daily, many genes and alternative splicing events have not yet been discovered. If an RNA alignment tool relies only on the known gene database, many reads that come from unknown genes or alternative splicing events cannot be aligned. We therefore present a combinational method that covers potential alternative splicing junction sites. Combined with the original gene database, the new alignment tool covers most alignments reported by other RNA-Seq alignment tools. Many RNA-Seq alignment tools have been developed recently, and they are more powerful and faster than the older generation of tools. However, RNA read alignment is much more complicated than other sequence alignment, and the alignments reported by some RNA-Seq alignment tools have low accuracy. We present a simple and efficient filtering method based on the quality scores of the reads. 
It can filter out most low-accuracy alignments. Finally, we present an RNA secondary structure prediction method that can predict pseudoknots (a type of RNA secondary structure) with high sensitivity and specificity. / published_or_final_version / Computer Science / Master / Master of Philosophy Nucleotide sequence - Data processing
15	A fast and accurate model to detect germline SNPs and somatic SNVs with high-throughput sequencing Wang, Weixin, 王煒欣 January 2014 The rapid development of high-throughput sequencing technology provides a new opportunity to extend the scale and resolution of genomic research. Efficiently and accurately calling genetic variants at the single-base level (germline single nucleotide polymorphisms (SNPs) or somatic single nucleotide variants (SNVs)) is a fundamental challenge in sequencing data analysis, because these variants have been reported to influence transcriptional regulation, alternative splicing, non-coding RNA regulation and protein coding. Many applications have been developed to tackle this challenge. However, shallow sequencing depth and cellular heterogeneity prevent those tools from attaining satisfactory accuracy, and the huge volume of sequencing data makes the process inefficient. In this dissertation, the performance of prevalent read aligners and SNP callers for second-generation sequencing (SGS) is first evaluated. The significantly lower coverage and poorer SNP calling performance of SGS in the regulatory regions of the human genome, attributed to their high GC content, are then investigated. To enhance the ability to call SNPs, especially within lower-depth regions, a fast and accurate SNP detection (FaSD) program that uses a binomial-distribution-based algorithm and a mutation probability is proposed. Comparison with popular software, benchmarked against SNP arrays and high-depth sequencing data, demonstrates that FaSD has the best SNP calling accuracy in terms of genotype concordance rate and AUC. Furthermore, FaSD can finish SNP calling within four hours for 10X human genome SGS data on a standard desktop computer. Lastly, combined with joint genotype likelihoods, an updated version of FaSD is proposed to call cancerous somatic SNVs between paired tumor and normal samples. 
Extensive assessments on various types of cancer demonstrate that, whether benchmarked against known somatic SNVs and germline SNPs from databases or against somatic SNVs called from higher-depth data, FaSD-somatic has the best overall performance. Inheriting and improving on FaSD, FaSD-somatic is also the fastest somatic SNV caller among current programs, and can finish calling somatic mutations within 14 hours for 50X paired tumor and normal samples on an ordinary server. / published_or_final_version / Biochemistry / Doctoral / Doctor of Philosophy Chromosome polymorphism Nucleotide sequence
16	Binning and annotation for metagenomic next-generation sequencing reads Wang, Yi, 王毅 January 2014 The development of next-generation sequencing technology enables us to obtain a vast number of short reads from metagenomic samples. In metagenomic samples, the reads from different species are mixed together. Metagenomic binning has therefore been introduced to cluster reads from the same or closely related species, and metagenomic annotation has been introduced to predict the taxonomic information of each read. Both metagenomic binning and annotation are critical steps in downstream analysis. This thesis discusses the difficulties of these two computational problems and proposes two algorithmic methods, MetaCluster 5.0 and MetaAnnotator, as solutions. There are six major challenges in metagenomic binning: (1) the lack of reference genomes; (2) uneven abundance ratios; (3) short read lengths; (4) a large number of species; (5) the existence of species with extremely low abundance; and (6) recovering low-abundance species. To solve these problems, I propose a two-round binning method, MetaCluster 5.0. The improvement achieved by MetaCluster 5.0 is based on three major observations. First, the short q-mer (length-q substring of the sequence with q = 4, 5) frequency distributions of individual sufficiently long fragments sampled from the same genome are more similar than those sampled from different genomes. Second, sufficiently long w-mers (length-w substring of the sequence with w ≈ 30) are usually unique in each individual genome. Third, the k-mer (length-k substring of the sequence with k ≈ 16) frequencies from reads of a species are usually linearly proportional to the species’ abundance. 
The metagenomic annotation methods in the literature often suffer from five major drawbacks: (1) inability to annotate many reads; (2) less precise annotation for reads and more incorrect annotation for contigs; (3) inability to deal well with novel clades that have limited reference genomes; (4) performance affected by variable genome sequence similarities between different clades; and (5) high time complexity. In this thesis, a novel tool, MetaAnnotator, is proposed to tackle these problems. There are four major contributions of MetaAnnotator. Firstly, instead of annotating reads/contigs independently, a cluster of reads/contigs is annotated as a whole. Secondly, multiple reference databases are integrated. Thirdly, for each individual clade, quadratic discriminant analysis is applied to capture the similarities between reference sequences in the clade. Fourthly, instead of using alignment tools, MetaAnnotator performs annotation using exact k-mer matching, which is more efficient. Experiments on both simulated and real datasets show that MetaCluster 5.0 and MetaAnnotator outperform existing tools, with higher accuracy as well as lower time and space costs. / published_or_final_version / Computer Science / Doctoral / Doctor of Philosophy Nucleotide sequence - Data processing
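The first observation in entry 16 (short q-mer frequency distributions are more similar within a genome than between genomes) can be sketched in Python as follows. This is only an illustration under stated assumptions, not the MetaCluster 5.0 code: the function names are hypothetical, and plain Euclidean distance is used as a stand-in because the abstract does not name the similarity measure.

```python
from collections import Counter
from itertools import product
import math


def qmer_frequencies(seq, q=4):
    """Return the normalized frequency vector of all 4**q DNA q-mers in seq."""
    seq = seq.upper()
    # Count every overlapping length-q substring.
    counts = Counter(seq[i:i + q] for i in range(len(seq) - q + 1))
    # Fixed ordering of all q-mers over A, C, G, T; windows containing
    # other characters (such as N) are ignored.
    qmers = ["".join(p) for p in product("ACGT", repeat=q)]
    total = sum(counts[m] for m in qmers)
    if total == 0:
        return [0.0] * len(qmers)
    return [counts[m] / total for m in qmers]


def euclidean_distance(a, b):
    """Distance between two frequency vectors (assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Under this sketch, two fragments would be grouped together when the distance between their frequency vectors is small; the threshold and the clustering step are left out because the abstract does not describe them.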
17	Granulicatella, abiotrophia, and gemella bacteremia characterized by 16S ribosomal RNA gene sequencing 招紹裘, Chiu, Siu-kau. January 2002 published_or_final_version / Medical Sciences / Master / Master of Medical Sciences Bacterial genetics. Nucleotide sequence.
18	Further development of the visual genome explorer: a visual genomic comparative tool 鄭啓航, Cheng, Kai-hong. January 2001 published_or_final_version / Zoology / Master / Master of Philosophy Nucleotide sequence - Computer programs.
19	Exploiting high throughput DNA sequencing data for genomic analysis Fritz, Markus Hsi-Yang January 2012 No description available. 570.285 Nucleotide sequence ; Genomics
20	Rapid methods for the detection of toxic cyanobacteria / Fergusson, Kim Marie. Unknown Date Thesis (PhD)--University of South Australia, 2003. Cyanobacteria Nucleotide sequence Genes