  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
11

Pandemic Vibrio parahaemolyticus: defining strains using molecular typing and a growth advantage at lower temperatures

Davis, Carisa Renee. January 2008
Dissertation (Ph.D.)--University of South Florida, 2008. / Includes vita. Includes bibliographical references. Also available online.
12

Data mining of post genome-wide association studies and next generation sequencing

桂宏胜, Gui, Hongsheng January 2013
abstract / Psychiatry / Doctoral / Doctor of Philosophy
13

Predicting functional impact of nonsynonymous mutations by quantifying conservation information and detect indels using split-read approach

Zeng, Shuai, 曾帥 January 2014
Rapidly developing sequencing technology has given scientists an opportunity to examine detailed genotype information in the human genome. Computational programs have played important roles in identifying disease-related genomic variants from huge amounts of sequencing data. In the past years, a number of computational algorithms have been developed, solving many crucial problems in sequencing data analysis, such as mapping sequencing reads to the genome and identifying SNPs. However, many difficult and important issues still await satisfactory solutions. A key challenge is identifying disease-related mutations against the background of non-pathogenic polymorphisms. Another crucial problem is detecting INDELs, especially long deletions, under the technical limitations of second-generation sequencing technology. To predict disease-related mutations, we developed a machine learning-based (random forests) prediction tool, EFIN (Evaluation of Functional Impact of Nonsynonymous mutations). We build a multiple sequence alignment (MSA) for a query protein with its homologous sequences. The MSA is then divided into different blocks according to the taxonomic information of the sequences. After that, we quantify the conservation in each block using a number of selected features, for example entropy, a concept borrowed from information theory. EFIN was trained on the Swiss-Prot and HumDiv datasets. In a series of fair comparisons, EFIN showed better results than widely used algorithms in terms of AUC (area under the ROC curve), accuracy, specificity and sensitivity. The web-based database is available to users worldwide at paed.hku.hk/efin. To solve the second problem, we developed SPLindel, Linux-based software that detects deletions (especially long deletions) and insertions using second-generation sequencing data. 
For each sample, SPLindel uses a split-read method to detect candidate INDELs by building alternative references to go along with the reference sequences. We then remap all the relevant reads using both the original references and the alternative allele references. A Bayesian model integrating paired-end information is used to assign the reads to the most likely locations on either the original reference allele or the alternative allele. Finally, we count the number of reads that support the alternative allele (with insertions or deletions compared to the original reference allele) and the original allele, and fit a beta-binomial mixture model. Based on this model, the likelihood for each INDEL is calculated and the genotype is predicted. SPLindel runs at about the same speed as GATK, but much faster than DINDEL. SPLindel obtained results very similar to those of GATK and DINDEL for INDELs of size 1-15 bp, but is much more effective in detecting larger INDELs. Using machine learning and statistical modeling, we propose tools to solve these two important problems in sequencing data analysis. This study will help identify novel damaging nsSNPs more accurately and efficiently, and equip researchers with a more powerful tool for identifying INDELs, especially long deletions. As more and more sequencing data are generated, the methods and tools introduced in this thesis may help us extract useful information to facilitate the identification of causal mutations for human diseases. / published_or_final_version / Paediatrics and Adolescent Medicine / Doctoral / Doctor of Philosophy
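The abstract above mentions entropy as one of the features used to quantify conservation within each block of the alignment. The following is a minimal sketch of that idea only, not EFIN's actual feature code: it computes the Shannon entropy of each column in a block of aligned sequences. The function names, the choice to ignore gap characters, and the toy block are assumptions made for illustration.

```python
import math
from collections import Counter


def column_entropy(column, ignore_gaps=True):
    """Shannon entropy (in bits) of one alignment column.

    Low entropy means the position is highly conserved.
    Ignoring '-' gap characters is an assumption of this sketch.
    """
    residues = [c for c in column if not (ignore_gaps and c == "-")]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def block_entropies(block):
    """Per-column entropy for a block of equal-length aligned sequences."""
    return [column_entropy(col) for col in zip(*block)]


# Hypothetical block of aligned sequences from one taxonomic group.
example_block = ["MKTA", "MKTA", "MRTA", "MK-A"]
entropies = block_entropies(example_block)
print(entropies)
```

In this toy block only the second column varies (three K, one R), so it is the only column with non-zero entropy; a substitution at a low-entropy position would be treated as more likely to be damaging than one at a variable position.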
14

A study on predicting gene relationship from a computational perspective

Chan, Pui-yee., 陳沛儀. January 2004
published_or_final_version / abstract / toc / Computer Science and Information Systems / Master / Master of Philosophy
15

A multi-agent model for DNA analysis

高銘謙, Ko, Ming-him. January 1999
published_or_final_version / Electrical and Electronic Engineering / Master / Master of Philosophy
16

Biomolecular characterization of mumps virus genotypes with varying neurovirulence

Tecle, Tesfaldet, January 2002
Diss. (summary) Stockholm: Karolinska Institutet, 2002. / Accompanied by 6 papers.
17

Characterization of the in vivo functions of Y-Family DNA polymerases kappa and Rev1

Kosarek, Jayme Nicole January 2008
Dissertation (Ph.D.) -- University of Texas Southwestern Medical Center at Dallas, 2008. / Vita. Bibliography: p. 117-123.
18

Approximate string alignment and its application to ESTs, mRNAs and genome mapping

Yim, Cheuk-hon, Terence., 嚴卓漢. January 2004
published_or_final_version / abstract / Computer Science and Information Systems / Master / Master of Philosophy
19

Finding motif pairs from protein interaction networks

Siu, Man-hung., 蕭文鴻. January 2008
published_or_final_version / Computer Science / Master / Master of Philosophy
20

Evaluating and Improving the Efficiency of Software and Algorithms for Sequence Data Analysis

Eaves, Hugh L 01 January 2016
With the ever-growing size of sequence data sets, data processing and analysis account for an increasingly large portion of the time and money spent on nucleic acid sequencing projects. Correspondingly, the performance of the software and algorithms used to perform that analysis has a direct effect on the time and expense involved. Although the analytical methods vary widely, certain types of software and algorithms are applicable to a number of areas. Targeting improvements to these common elements has the potential for wide-reaching rewards. This dissertation research consisted of several projects to characterize and improve the efficiency of several common elements of sequence data analysis software and algorithms. The first project sought to improve the efficiency of the short read mapping process, as mapping is the most time-consuming step in many data analysis pipelines. The result was a new short read mapping algorithm and software, demonstrated to be more computationally efficient than existing software and enabling more of the raw data to be utilized. While developing this software, it was discovered that a widely used bioinformatics software library introduced a great deal of inefficiency into the application. Given the potential impact of similar libraries on other applications, and because little research had been done to evaluate library efficiency, the second project evaluated the efficiency of seven of the most popular bioinformatics software libraries, written in C++, Java, Python, and Perl. This evaluation showed that two of the libraries written in the most popular language, Java, were an order of magnitude slower and used more memory than expected based on the language in which they were implemented. The third and final project, therefore, was the development of a new general-purpose bioinformatics software library for Java. This library, known as BioMojo, incorporated a new design approach resulting in vastly improved efficiency. 
Assessing the performance of this new library using the benchmark methods developed for the second project showed that BioMojo outperformed all of the other libraries across all benchmark tasks, being up to 30 times more CPU efficient than existing Java libraries.
