Global ETD Search

1	STR amplification of DNA mixtures: fidelity of contributor proportion when calculated from DNA profile data using known mixture samples
Huang, Rui Fen. January 2013.
DNA mixtures are frequently encountered in forensic casework, especially in cases of sexual assault. When evidence is recovered, the sample may have come from multiple contributors in different proportions. The first part of this study examines the fidelity of contributor proportions by using residuals to analyze known mixture samples. The coefficient of determination between the expected and observed proportions was also determined and used to assess the fidelity of mixture proportions. The second part of the study involved separating major and minor contributors in a mixture by characterizing the observed proportions. Results for the 2-person mixtures show that as the mass of amplified DNA decreases, the number of allele dropouts increases. Furthermore, as mass decreases, the variation between the expected and observed proportions increases, as determined by the residuals and the coefficients of determination. In addition, as mixture proportions become more disparate, the resulting variation between the expected and observed proportions is not as great as that caused by decreasing mass. For the 3-person mixtures, as mass decreases, the residuals increase. Also, when the coefficients of determination of the 3-person mixtures were compared with those obtained for the 2-person mixtures, the R² values were larger for the former, a result of higher total amplification masses. In the 1:2/2:1 mixtures, major and minor proportions are not distinguishable. In the 1:4/4:1 mixtures, major and minor proportions can be distinguished at 1 ng. In the 1:9/9:1 mixtures, proportions are distinguishable at 1 and 0.5 ng. Mixtures could not be distinguished at the 0.25 ng level regardless of proportion, a result of the increase in variation with decreasing mass.
Short tandem repeats (STR); Forensic casework; DNA
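The two fidelity measures named in this abstract, residuals and the coefficient of determination, can be sketched as follows. This is only an illustration: the function names and the proportion values are hypothetical, and R² is assumed here to be the squared Pearson correlation between expected and observed proportions, which may differ from the definition used in the thesis.

```python
def residuals(expected, observed):
    """Observed minus expected contributor proportion, one value per sample."""
    return [o - e for e, o in zip(expected, observed)]


def r_squared(expected, observed):
    """Squared Pearson correlation between expected and observed proportions.

    Assumption: this is one common definition of the coefficient of
    determination; the thesis may compute it differently.
    """
    n = len(expected)
    mean_e = sum(expected) / n
    mean_o = sum(observed) / n
    cov = sum((e - mean_e) * (o - mean_o) for e, o in zip(expected, observed))
    var_e = sum((e - mean_e) ** 2 for e in expected)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov * cov / (var_e * var_o)


# Hypothetical minor-contributor proportions, not data from the thesis.
expected = [0.10, 0.20, 0.33, 0.50, 0.80]
observed = [0.14, 0.17, 0.36, 0.49, 0.77]

print(residuals(expected, observed))
print(r_squared(expected, observed))
```

Larger residuals and a lower R² at a given DNA mass would correspond to the loss of fidelity the abstract reports for low-mass amplifications.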
2	Taxas de mutação de 14STRs autossômicas na população de Pernambuco / Mutation rates of 14 autosomal STRs in the population of Pernambuco
ANDRADE, Edilene Santos de. 31 January 2008.
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
Defining the mutation rates of the microsatellite or Short Tandem Repeat (STR) loci used in forensic analyses is useful for the correct interpretation of genetic profile results and for defining exclusion criteria in paternity tests. Germline mutations at 14 STR loci were studied by analyzing 54,105 parent-child allelic transfers from 2,575 paternity test cases carried out during 2000-2007 in the population of Pernambuco, Northeast Brazil. Kinship in each of these cases was highly validated (probability > 99.99%). Forty-three mutations were identified at 12 loci. Locus-specific mutation rates ranged from 2 × 10⁻⁴ to 2 × 10⁻³, and the overall mutation rate was estimated at 8 × 10⁻⁴. Mutation events were more frequent in the male germline than in the female germline. Most mutations (95%) can be explained by the loss or gain of a single repeat unit, and there was no evidence of selection between addition and deletion mutations. Our data were compared with data from American and European populations and showed that the mutation rates of STR loci do not differ among populations.
Short Tandem Repeats; Mutation rates; Paternity tests
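The overall rate quoted in this abstract follows directly from the two counts it reports. A minimal sketch of the arithmetic, with a hypothetical function name:

```python
def mutation_rate(mutations, transfers):
    """Mutations per parent-child allelic transfer."""
    return mutations / transfers


# Counts taken from the abstract: 43 mutations in 54,105 allelic transfers.
overall = mutation_rate(43, 54_105)
print(f"{overall:.1e}")  # about 7.9e-04, i.e. roughly 8 x 10^-4
```

Locus-specific rates would be obtained the same way from per-locus counts, which the abstract does not list.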
3	Estimativa de mistura étnica avaliada por Marcadores Informativos de Ancestralidade (AIMs) e Microssatélites (STRs) / Estimate of ethnic admixture assessed by Ancestry Informative Markers (AIMs) and Microsatellites (STRs)
Teló, Enio Paulo. January 2010.
Fundação Oswaldo Cruz. Centro de Pesquisas Gonçalo Moniz. Salvador, Bahia, Brasil
Admixture among the three main ethnic groups (Amerindians, Europeans and Africans) gave rise to the high genetic diversity of the Brazilian population. In Bahia, the proportion of people of African descent is 77.5%, and in Salvador 79.8% of people describe themselves as black or brown. Few studies describe the genetic diversity of the population of Bahia and the contribution of each ethnic group to its formation. Several DNA markers are currently used to estimate ethnic admixture in admixed populations. These markers are called population-specific alleles (PSAs) or ancestry informative markers (AIMs), and they have alleles whose frequencies differ widely, by more than 30%, between geographically or ethnically defined populations. Microsatellites (STRs) are genetic variants useful for the genetic mapping of species, the identification of individuals and the analysis of populations. Some STRs have alleles with distinctive frequencies in particular population groups.
To compare genomic ancestry as assessed with the two types of markers, 8 autosomal microsatellite STRs (TH01, vWA31, D18S51, FGA, TPOX, D7S820, D3S1358, D8S1179) and 9 AIMs (FY-Null, LPL, AT3-I/D, Sb19.3, APO, PV92, CYP3A4, CKMM, GC-1F and GC-1S) were studied in 203 admixed individuals from Bahia. Genotyping was performed by PCR (Polymerase Chain Reaction) for deletions, insertions and microsatellites, and by quantitative real-time PCR for point mutations. The observed African, European and Amerindian contributions were 33.5%, 58.6% and 7.9%, respectively, for the STRs and 45.08%, 45.16% and 9.75% for the AIMs, confirming the admixture of the population. The Kappa index showed that agreement between the ancestry estimates obtained with the two types of markers (AIMs and STRs) was very low (kappa = 0.12). An association was observed between surnames with a religious connotation and African ancestry.
Genomic Ancestry; Short Tandem Repeats; Ancestry Informative Markers
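The Kappa index mentioned in this abstract measures agreement between two classifications beyond what chance alone would produce. The sketch below computes Cohen's kappa for two lists of categorical labels. It assumes that each individual is assigned one predominant ancestry category per marker set; the abstract does not say how the estimates were categorized, and the labels shown are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long lists of categorical labels."""
    n = len(labels_a)
    # Fraction of individuals given the same label by both methods.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance from the marginal label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical predominant-ancestry calls for six individuals.
by_strs = ["EUR", "EUR", "AFR", "EUR", "AMR", "AFR"]
by_aims = ["AFR", "EUR", "AFR", "AFR", "EUR", "EUR"]
print(cohens_kappa(by_strs, by_aims))
```

A kappa near 1 indicates near-perfect agreement and a value near 0 indicates agreement no better than chance, which is why the reported kappa of 0.12 is described as very low.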
4	Microsatellite Evolution in The Yeast Genome - A Genomic Approach
Merkel, Angelika. January 2008.
Microsatellites are short (1-6 bp long), highly polymorphic tandem repeats found in all genomes analyzed so far. They are popular genetic markers for many applications, including population genetics, pedigree analysis, genetic mapping and linkage analysis; some microsatellites can also cause a variety of human neurodegenerative diseases and may act as agents of adaptive evolution through the regulation of gene expression. As a consequence of these diverse uses and functions, the mutational and evolutionary dynamics of microsatellite sequences have gained much attention in recent years. Mostly, the focus of studies investigating microsatellite evolution has been to develop more refined evolutionary models for estimating parameters such as genetic distance or linkage disequilibrium. However, there is also an incentive to use our understanding of the evolutionary processes that affect these sequences to examine the functional implications of microsatellite evolution. What has emerged from nearly two decades of study are highly complex mutational dynamics, with mutation rates varying across species, loci and alleles, and a multitude of potential influences on these rates, most of which are not yet fully understood. The increasing availability of whole genome sequences has immensely extended the scope for studying microsatellite evolution. For example, where once it was common to examine single loci, it is now possible to examine microsatellites using genome-wide approaches. In the first part of my dissertation I discuss approaches and issues associated with detecting microsatellites in genomic data.
In Chapter 2 I undertook a meta-analysis of studies investigating the distribution of microsatellites in yeast and showed that comparisons of microsatellite distributions in genomic data can be fraught because different investigators apply different definitions of microsatellites. In particular, I found that variation in how investigators choose the repeat unit size of a microsatellite, handle imperfections in the array and, especially, set the minimum array length leads to a large divergence in results and can distort the conclusions drawn from such studies, particularly where inter-specific comparisons are being made. In a review of the currently available suite of bioinformatics tools (Chapter 3), I further showed that this bias extends beyond a solely theoretical controversy into a methodological issue, because most software tools not only incorporate different definitions for the key parameters used to define microsatellites, but also employ different strategies to search and filter for microsatellites in genomic data. In this chapter I provide an overview of the available tools and a practical guide to help other researchers choose the appropriate tool for their research purpose. In the second part of my thesis, I use the analytical framework developed in the previous chapters to explore the biological significance of microsatellites, exploiting the well-annotated genome of the model organism Saccharomyces cerevisiae (baker’s yeast). Several studies in different organisms have indicated spatial associations between microsatellites and individual genomic features, such as transposable elements, recombinational hotspots, GC-content or local substitution rate. In Chapter 4, I summarized these studies and tested some of the underlying hypotheses on microsatellite distribution in the yeast genome using Generalized Linear Models (GLM) and wavelet transformation.
I found that microsatellite type and distribution within the genome are strongly governed by local sequence composition and negative selection in coding regions, and that microsatellite frequency is inversely correlated with SNP density, reflecting the stabilizing effect point mutations have on microsatellites. Microsatellites may also be markers for recent genome modifications, due to their depletion in regions near LTR transposons, and elements of potential structural importance, since I found associations with features such as meiotic double strand breaks, regulatory sites and nucleosomes. Microsatellites are subject to local genomic influences, particularly on small (1-2 kb) scales. Although these local-scale influences might not be as dominant as other factors on a genome-wide scale, they are certainly of importance with respect to individual loci. Analysis of locus conservation across 40 related yeast strains (Chapter 5) showed no bias in the type of microsatellites conserved, only a negative influence of coding sequences, which again supports the idea that microsatellites evolve neutrally. Polymorphism was rare and, despite a positive correlation with array length, there was no relationship with either genomic fraction or repeat size. However, the analysis also revealed a non-random distribution of microsatellites in genes of functionally distinct groups. For example, conserved microsatellites (similar to general microsatellites in yeast) are mostly found in genes associated with the regulation of biological and cellular processes. Polymorphic loci further show an association with the organization and biogenesis of cellular components, morphogenesis, development of anatomical structures and pheromone response, which is absent for monomorphic loci. Whether this distribution is an indication of functionality or simply neutral mutation (e.g. genetic hitch-hiking) is debatable, since most conserved microsatellites, particularly variable loci, are located within genes that show low selective constraints. Overall, microsatellites appear as neutrally evolving sequences, but owing to the sheer number of loci within a single genome, individual loci may well acquire some functionality. More work is definitely needed in this area, particularly experimental studies, such as reporter-gene expression assays, to confirm phenotypic effects.
microsatellites; short tandem repeats; software; comparative genomics; yeast
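The point made in this abstract about detection parameters, that the chosen minimum array length changes how many microsatellites are reported, can be illustrated with a small sketch. It is not one of the tools reviewed in the thesis: the function name, the toy sequence and the thresholds are hypothetical, only perfect repeats with a 1-6 bp unit are considered, and Python's `re` backreference is used for brevity.

```python
import re


def count_microsatellites(seq, min_copies):
    """Count perfect tandem repeats with a 1-6 bp unit and >= min_copies copies."""
    # Group 1 is the repeat unit (shortest unit tried first); the backreference
    # then requires at least min_copies - 1 further adjacent copies of it.
    pattern = re.compile(r"([ACGT]{1,6}?)\1{%d,}" % (min_copies - 1))
    return sum(1 for _ in pattern.finditer(seq))


# Toy sequence: (CA)x5, (ATC)x3 and (A)x8, separated by GT spacers.
toy = "GT" + "CA" * 5 + "GT" + "ATC" * 3 + "GT" + "A" * 8 + "GT"

for threshold in (3, 5, 8):
    print(threshold, count_microsatellites(toy, threshold))
```

On this toy sequence the count falls from three loci to two and then to one as the threshold rises from 3 to 5 to 8 copies, which is the kind of definition-dependent divergence the meta-analysis describes.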
5	INTER-KINGDOM EPIGENETICS: CHARACTERIZATION OF MAIZE B1 TANDEM REPEAT-MEDIATED SILENCING IN DROSOPHILA MELANOGASTER
McEachern, Lori A. 19 August 2010.
Transgenic organisms are a valuable tool for studying epigenetics, as they provide significant insight into the evolutionary conservation of epigenetic control sequences, the interacting proteins, and the underlying molecular mechanisms. Paramutation is an epigenetic phenomenon in which the epigenetic status and expression level of one allele is heritably altered after pairing with another. At the b1 locus in maize, a control region consisting of seven 853 bp tandem repeats is required for paramutation. To study the conservation of the epigenetic mechanisms underlying maize b1 paramutation, I created transgenic Drosophila carrying the maize b1 control region flanked by FRT sites and adjacent to the Drosophila white reporter gene. The maize b1 tandem repeats caused epigenetic silencing in Drosophila, as white expression consistently increased following repeat removal. A single copy of the tandem repeat sequence was sufficient to cause silencing, and silencing strength increased as the number of repeats increased. Trans interactions, such as pairing-sensitive silencing, were also observed and appear to require a threshold number of b1 tandem repeats, similar to paramutation in maize. Analysis of transcription from the repeats showed that the b1 tandem repeats are transcribed from both strands in Drosophila, as they are in maize. Bidirectional transcription was found to extend to the regions flanking the repeats, and persisted in “repeats-out” transgenes following repeat removal. However, aberrant transcription was lost when a zero-repeat transgene was moved to a new genomic position, suggesting that it may be due to an epigenetic mark that is retained from the previous silenced state. A search for modifiers of b1 repeat-mediated silencing demonstrated that Polycomb group proteins are involved. Together, these results indicate considerable conservation of an epigenetic silencing process between the plant and animal kingdoms. Genomic imprinting is a related epigenetic process in which parent-specific epigenetic states are inherited and maintained in progeny. The conservation of epigenetic mechanisms was further explored via an in-depth review of the molecular mechanisms underlying genomic imprinting in plants, mammals and insects, and identification of potentially imprinted genes in Drosophila by microarray analysis.
Epigenetics; Drosophila melanogaster; Paramutation; Tandem repeats; Transcription; Silencing; Genomic imprinting
7	Développement et application de méthodes bioinformatiques pour l'analyse des protéines contenant des répétitions en tandem / Development and application of bioinformatics methods for the identification and characterisation of tandem repeat in protein sequences
Richard, François D. 21 October 2016.
Today, the growth of protein sequencing data significantly exceeds the growth of our capacity to analyze these data. In line with this data deluge and the urgent need for new bioinformatics tools, our work deals with the development of new algorithms to better understand the relationship between protein sequence, structure and function. Proteins contain a large portion of periodic sequences representing arrays of repeats that are directly adjacent to each other, so-called tandem repeats (TRs). TRs occur in at least 14% of all proteins. Highly divergent, they range from a single amino acid repetition to domains of 100 or more repeated residues. Numerous studies have demonstrated the fundamental functional importance of TRs and their involvement in human diseases, especially cancers. Here we show the importance of integrating several TR detectors to obtain the most complete set of TRs in proteomes. We designed an appropriate pipeline and developed two specific tools: a filter to speed up the process and a scoring module to select the most relevant TRs in the structured regions of proteins. Finally, we applied this pipeline in a large-scale analysis of TRs in 94 proteomes. This analysis allowed us to update the previous census of TRs, showing that TRs occur in 64% of all proteins, and led to a better understanding of TRs in terms of their characteristics, composition and implication in human disease.
Bioinformatics; Tandem repeats; Sequences; Proteomes
8	Flexible finite automata-based algorithms for detecting microsatellites in DNA
De Ridder, Corne. 17 August 2010.
Dissertation (MSc), University of Pretoria, 2010. Computer Science.
Apart from contributing to Computer Science, this research also contributes to Bioinformatics, a subset of the subject discipline Computational Biology. The main focus of this dissertation is the development of a data-analytical and theoretical algorithm to contribute to the analysis of DNA and, in particular, to detect microsatellites. Microsatellites, in the context of this dissertation, are consecutive patterns contained in genomic sequences. A perfect tandem repeat is defined as a string of nucleotides that is repeated at least twice in a sequence. An approximate tandem repeat is a string of nucleotides repeated consecutively at least twice, with small differences between the instances. The research presented in this dissertation was inspired by molecular biologists who were found to be visually scanning genetic sequences in search of short approximate tandem repeats, or so-called microsatellites. The aim of this dissertation is to present three algorithms that search for short approximate tandem repeats. The algorithms are based on the implementation of finite automata. Thus the hypothesis posed is as follows: finite automata can detect microsatellites effectively in DNA. "Effectively" includes the ability to fine-tune the detection process so that redundant data is avoided and relevant data is not missed during the search. To verify whether the hypothesis holds, three theoretically related algorithms have been proposed based on theorems from finite automaton theory. They are generically referred to as the FireμSat algorithms. These algorithms have been implemented, and the performance of FireμSat2 has been investigated and compared with that of other software packages.
From the results obtained, it is clear that the performance of these algorithms differs in terms of attributes such as speed, memory consumption and extensibility. In terms of speed, FireμSat outperformed rival software packages. The FireμSat algorithms have several parameters that can be used to tune their search. These parameters were devised in consultation with the intended user community in order to enhance the usability of the software. It was found that the parameters of FireμSat can be set to detect more tandem repeats than rival software packages, but can also be tuned to limit the number of detected tandem repeats.
Approximate tandem repeats; Regular expression; Finite automata; Microsatellites; UCTD
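The definition of an approximate tandem repeat given in this abstract (consecutive copies of a motif with small differences between instances) can be made concrete with a short sketch. This is not the FireμSat algorithm and does not build a finite automaton: the function names and the example sequence are hypothetical, the motif is supplied by the caller, and only substitutions are tolerated, not insertions or deletions.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def count_approx_copies(seq, motif, start=0, max_mismatches=1):
    """Count consecutive windows from `start` that match `motif` approximately."""
    m = len(motif)
    copies = 0
    pos = start
    # Advance one motif length at a time while each window stays within
    # the allowed number of substitutions.
    while pos + m <= len(seq) and hamming(seq[pos:pos + m], motif) <= max_mismatches:
        copies += 1
        pos += m
    return copies


example = "ACGACGATGACGTTT"
strict = count_approx_copies(example, "ACG", max_mismatches=0)
relaxed = count_approx_copies(example, "ACG", max_mismatches=1)
print(strict, relaxed)
```

Under the abstract's definition, a run counts as a tandem repeat once it has at least two copies. Raising `max_mismatches` lengthens the detected run, which mirrors the tuning trade-off described above between detecting more repeats and limiting their number.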
9	Characterizing VNTRs in human populations
Eslami Rasekh, Marzieh. 04 October 2021.
Over half the human genome consists of repetitive sequences. One major class is the tandem repeats (TRs), which are defined by their location in the genome, repeat unit, and copy number. TR loci that exhibit variant copy numbers are called Variable Number Tandem Repeats (VNTRs). High VNTR mutation rates of approximately 0.0001 per generation make them suitable for forensic studies, and of interest for potential roles in gene regulation and disease. TRs are generally divided into three classes: 1) microsatellites or short tandem repeats (STRs) with patterns <7 bp; 2) minisatellites with patterns of seven to hundreds of base pairs; and 3) macrosatellites with patterns of >100 bp. To date, mini- and macrosatellites have been poorly characterized, mainly due to a lack of computational tools. In this thesis, I use a tool, VNTRseek, to identify human minisatellite VNTRs using short-read sequencing data from nearly 2,800 individuals, and I develop a new computational tool, MaSUD, to identify human macrosatellite VNTRs using data from 2,504 individuals. MaSUD is the first high-throughput tool to genotype macrosatellites using short reads. I identified over 35,000 minisatellite VNTRs and over 4,000 macrosatellite VNTRs, most previously unknown. A small subset in each VNTR class was validated experimentally and in silico. The detected VNTRs were further studied for their effects on gene expression, ability to distinguish human populations, and functional enrichment. Unlike STRs, mini- and macrosatellite VNTRs are enriched in regions with functional importance, e.g., introns, promoters, and transcription factor binding sites. A study of VNTRs across 26 populations shows that minisatellite VNTR genotypes can be used to predict super-populations with >90% accuracy. In addition, genotypes for 195 minisatellite VNTRs and 22 macrosatellite VNTRs were shown to be associated with differential expression in nearby genes (eQTLs). Finally, I developed a computational tool, mlZ, to infer undetected VNTR alleles and to detect false positive predictions. mlZ is applicable to other tools that use read support for predicting short variants. Overall, these studies provide the most comprehensive analysis of mini- and macrosatellites in human populations and will facilitate the application of VNTRs for clinical purposes.
Bioinformatics; Genomic variation; Genotyping; Macrosatellite; Minisatellite; Tandem repeats; VNTR
10	Genome-wide analysis of transcriptome dynamics in plants and algae
Zhao, Zhixin. 04 December 2013.
No description available.
Bioinformatics; Polyadenylation; splicing; plants; algae; tandem repeats; phytozome
