Haplotype Inference as a case of Maximum Satisfiability: A strategy for identifying multi-individual inversion points in computational phasing

Phasing genotypes from sequence data is an important step between data gathering and downstream analysis in population genetics, disease studies, and many other fields. This determination of the sequences of markers corresponding to the individual chromosomes can be done on data where the markers are in low density across the chromosome, such as from single nucleotide polymorphism (SNP) microarrays, or on data with a higher local density of markers, as in next generation sequencing (NGS). The sorted markers may then be used for many different analyses and data processing steps, such as linkage analysis or the inference of missing genotypes in the process of imputation.

cnF2freq is a haplotype phasing program that uses an uncommon approach allowing it to divide big groups of related individuals into smaller ones. It sets an initial haplotype phase and then iteratively changes it using estimations from Hidden Markov Models. If a marker is judged to have been placed in the wrong haplotype, a switch needs to be made so that it belongs to the correct phase. The objective of this project was to go from allowing only one individual within a group to be switched in an iteration to allowing multiple switches that are dependent on each other.

The result of this project is a theoretical solution for allowing multiple dependent switches in cnF2freq, and an implemented solution using the max-SAT solver toulbar2.

Identifier: oai:union.ndltd.org:UPSALLA1/oai:DiVA.org:uu-331675
Date: January 2017
Creators: Bergman, Ebba
Publisher: Uppsala universitet, Molekylär evolution
Source Sets: DiVA Archive at Uppsala University
Language: English
Detected Language: English
Type: Student thesis, info:eu-repo/semantics/bachelorThesis, text
Format: application/pdf
Rights: info:eu-repo/semantics/openAccess
Relation: UPTEC X ; 17 005
