The Human Genome Project was completed in 2003, yielding a draft of the human genome sequence. Any two human genomes are known to be almost identical; only a very small fraction of differences accounts for human diversity. A single nucleotide polymorphism (SNP) is a change of a single nucleotide base in DNA. The SNP sequence from one of a pair of chromosomes is called a haplotype. In this thesis, we study how to reconstruct a pair of chromosomes from a given set of fragments obtained by DNA sequencing of an individual. We define a new problem for chromosome reconstruction, the chromosome pair assembly problem. The goal is to find a pair of sequences that has the minimum number of mismatches with the input fragments and whose lengths are as short as possible. We first transform the problem instance into a directed multigraph, and then propose an efficient algorithm to solve the problem. We apply the ant colony optimization (ACO) algorithm to optimize the ordering of the input fragments and use dynamic programming to determine the SNP sites. Once the chromosome pair is reconstructed, the two haplotypes can also be determined. We run our algorithm on artificial test data. The experiments show that our results are close to the optimal solutions for the test data.
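The mismatch part of the objective described above can be illustrated with a small sketch. This is not the thesis's algorithm: it assumes each fragment already has a known start offset (in the thesis the fragment ordering is itself what ACO optimizes), it ignores the length-minimization term, and the names `mismatches` and `pair_cost` are hypothetical. Each fragment is simply charged against whichever of the two candidate sequences it fits better.

```python
from typing import List, Tuple

# A fragment is (start offset, bases). The offset is an assumption made for
# this illustration; the abstract does not say fragment positions are given.
Fragment = Tuple[int, str]


def mismatches(seq: str, frag: Fragment) -> int:
    """Count positions where the fragment disagrees with seq at its offset."""
    start, bases = frag
    window = seq[start:start + len(bases)]
    # Bases that run past the end of seq are counted as mismatches.
    overhang = len(bases) - len(window)
    return sum(a != b for a, b in zip(window, bases)) + overhang


def pair_cost(seq_a: str, seq_b: str, frags: List[Fragment]) -> int:
    """Total mismatches when each fragment is assigned to its closer sequence."""
    return sum(min(mismatches(seq_a, f), mismatches(seq_b, f)) for f in frags)


# Two candidate chromosome sequences differing at one site (index 4).
seq_a = "ACGTACGT"
seq_b = "ACGTTCGT"
frags = [(0, "ACGTA"), (2, "GTTCG"), (4, "ACGA")]
print(pair_cost(seq_a, seq_b, frags))  # prints 1
```

In this toy input the first fragment matches `seq_a` exactly, the second matches `seq_b` exactly, and the third differs from `seq_a` in one base, so the total cost is 1. The single site where the two candidates differ plays the role of an SNP site.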
Identifier | oai:union.ndltd.org:NSYSU/oai:NSYSU:etd-0824105-164952 |
Date | 24 August 2005 |
Creators | Wei, Liang-Tai |
Contributors | Yow-Ling Shine, Chang-Biau Yang, Yue-Li Wang, Wu-Chih Hu, Chia-Ning Yang |
Publisher | NSYSU |
Source Sets | NSYSU Electronic Thesis and Dissertation Archive |
Language | English |
Detected Language | English |
Type | text |
Format | application/pdf |
Source | http://etd.lib.nsysu.edu.tw/ETD-db/ETD-search/view_etd?URN=etd-0824105-164952 |
Rights | off_campus_withheld, Copyright information available at source archive |