
Statistical issues in modelling the ancestry from Y-chromosome and surname data

A considerable industry has grown up around genealogical inference from genetic testing, supplementing more traditional genealogical techniques but with very limited quantification of uncertainty. In many societies Y-chromosomes are co-inherited with surnames, both being passed down from father to son. This thesis seeks to explore what this correlation can say about ancestry. In particular, it is concerned with estimating the time to the most recent common paternal ancestor (TMRCA) for pairs of males who are not known to be directly related but share the same surname, based on the repeat number at short tandem repeat (STR) markers on their Y-chromosomes. We develop a model of TMRCA estimation based on the difference in repeat numbers in pairs of male haplotypes, using a Bayesian framework and Markov chain Monte Carlo techniques such as the adaptive Metropolis-Hastings algorithm. The model incorporates the process of STR discovery and the calibration of mutation rates, which can differ across STRs. In simulation studies, we find that the estimates of TMRCA are rather robust to the ascertainment process and the way in which it is modelled. However, they are affected by the site-specific mutation rates at the typed STRs. Indeed, sequencing the fastest mutating STRs yields a lower error in the estimated TMRCA than sequencing random STRs. In the British context, we extend our model to include additional information such as the haplogroup status (as determined from single nucleotide polymorphisms, SNPs) of the pair of males, as well as the frequency and origin of the surname. In general, the effect of this is to reduce estimates of the TMRCA for pairs of males with an older TMRCA, typically outwith the period of surname establishment (about 500-700 years ago).
In the genealogical context, incorporating surname frequency (within the prior distribution) results in lower estimates of TMRCA for pairs of males who appear to have diverged from a common male ancestor since the period of surname establishment. In addition, our model includes uncertainty in the conversion factor from generations to years.
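
As a rough illustration of the kind of inference described above, the following Python sketch estimates a TMRCA (in generations) from per-STR repeat differences. It is not the model developed in the thesis. It assumes a symmetric single-step mutation model, under which the repeat difference at one STR follows a symmetric Skellam distribution, an exponential prior on the TMRCA, and a plain (non-adaptive) random-walk Metropolis-Hastings sampler. It ignores STR ascertainment, haplogroup status, and surname information. All function names, mutation rates, and the prior mean are hypothetical choices made for the example.

```python
import math
import random


def log_skellam_symmetric(d, lam):
    """log P(D = d), where D is the difference of two independent Poisson(lam) counts."""
    k = abs(d)
    # Series for the modified Bessel function I_k(2 * lam), summed in log space.
    n_terms = int(3 * lam) + 30
    terms = [(2 * m + k) * math.log(lam) - math.lgamma(m + 1) - math.lgamma(m + k + 1)
             for m in range(n_terms)]
    peak = max(terms)
    log_bessel = peak + math.log(sum(math.exp(t - peak) for t in terms))
    return -2.0 * lam + log_bessel


def log_posterior(t, diffs, rates, prior_mean):
    """Unnormalised log posterior of the TMRCA t (generations)."""
    if t <= 0:
        return -math.inf
    log_prior = -t / prior_mean  # assumed exponential prior, up to a constant
    # Upward and downward steps across both lineages are each Poisson(mu * t).
    log_lik = sum(log_skellam_symmetric(d, mu * t) for d, mu in zip(diffs, rates))
    return log_prior + log_lik


def sample_tmrca(diffs, rates, prior_mean=100.0, n_iter=2000, step=0.5, seed=1):
    """Random-walk Metropolis-Hastings on log(t); returns draws after burn-in."""
    rng = random.Random(seed)
    t = prior_mean
    lp = log_posterior(t, diffs, rates, prior_mean)
    draws = []
    for _ in range(n_iter):
        # Multiplicative proposal; log(prop / t) is the Hastings correction.
        prop = t * math.exp(step * rng.gauss(0.0, 1.0))
        lp_prop = log_posterior(prop, diffs, rates, prior_mean)
        log_ratio = lp_prop - lp + math.log(prop / t)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            t, lp = prop, lp_prop
        draws.append(t)
    return draws[n_iter // 5:]  # discard the first 20% as burn-in


def posterior_median(draws):
    ordered = sorted(draws)
    return ordered[len(ordered) // 2]


if __name__ == "__main__":
    rates = [0.003] * 10                      # hypothetical per-generation rates
    diffs = [1, 0, 2, 0, 1, 0, 0, 1, 0, 1]    # hypothetical repeat differences
    print(posterior_median(sample_tmrca(diffs, rates)))
```

Multiplying the sampled generations by a years-per-generation factor, itself drawn from a distribution, would be one way to carry that uncertainty through to a TMRCA in years.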

Identifier oai:union.ndltd.org:bl.uk/oai:ethos.bl.uk:559951
Date January 2012
Creators Sharif, Maarya
Publisher University of Glasgow
Source Sets Ethos UK
Detected Language English
Type Electronic Thesis or Dissertation
Source http://theses.gla.ac.uk/3407/
