1 |
Improvements in the Accuracy of Pairwise Genomic Alignment
Hudek, Alexander Karl, January 2010
Pairwise sequence alignment is a fundamental problem in bioinformatics with wide applicability. This thesis presents three new algorithms for this well-studied problem. First, we present a new algorithm, RDA, which aligns sequences in small segments, rather than by individual bases. Then, we present two algorithms for aligning long genomic sequences: CAPE, a pairwise global aligner, and FEAST, a pairwise local aligner.
RDA produces interesting alignments that can differ substantially in structure from traditional alignments. It is also better than traditional alignment at the task of homology detection. However, its main drawback is a very slow run time. Further, although it produces alignments with a different structure, it is not clear whether the differences have practical value in genomic research.
Our main success comes from our local aligner, FEAST. We describe two main improvements: a new, more descriptive model of evolution, and a new local extension algorithm that considers all possible evolutionary histories rather than only the most likely one. Our new model of evolution provides improved alignment accuracy and substantially improved parameter training. In particular, we produce a new parameter set for aligning human and mouse sequences that properly describes both regions of weak similarity and regions of strong similarity. The second result is our new extension algorithm. Depending on heuristic settings, it can provide more sensitivity than existing extension algorithms, more specificity, or a combination of the two.
By comparing with CAPE, our global aligner, we find that the sensitivity increase provided by our local extension algorithm is so substantial that it outperforms CAPE on sequences with 0.9 or more expected substitutions per site. CAPE itself gives improved sensitivity for sequences with 0.7 or more expected substitutions per site, but at a great run-time cost. FEAST and our local extension algorithm improve on this too: the run time is only slightly slower than that of existing local alignment algorithms and is asymptotically the same.
|
3 |
Parallel Algorithm for Memory Efficient Pairwise and Multiple Genome Alignment in Distributed Environment
Ahmed, Nova, 20 December 2004
Genome sequence alignment problems are very important in computational biology. They involve large amounts of data and are both memory intensive and computation intensive. In the literature, two separate algorithms have been studied and improved. One is a pairwise sequence alignment algorithm, which aligns pairs of genome sequences with reduced memory use and parallel computation. The other is a multiple sequence alignment algorithm, which aligns multiple genome sequences and is also parallelized efficiently so that the workload of the alignment program is well distributed. These parallel applications can be launched in different environments, and shared memory is very well suited to them. However, a shared-memory environment is limited in memory capacity and scalability, and such machines are very costly. A better approach is to use a cluster of computers, and the cluster environment can be further extended to a grid environment, so that scalability is improved by introducing multiple clusters. Here the grid environment is studied alongside the shared-memory and cluster environments for the two applications. For carefully designed algorithms, the grid environment performs comparably to the other distributed environments, and it sometimes outperforms them because of the resource limitations those environments have.
|
4 |
Alinhamento de seqüências biológicas com o uso de algoritmos genéticos / Biological sequence alignment using genetic algorithms
Ogata, Sabrina Oliveira, 14 March 2005
Universidade Federal de Sao Carlos
The comparison of genome sequences from different organisms is one of the computational applications most frequently used by molecular biologists. This operation supports other processes, such as determining the three-dimensional structure of proteins. In recent years, there has been a significant increase in the use of evolutionary algorithms, especially Genetic Algorithms (GAs), for optimization problems such as DNA or protein sequence alignment. GAs are search techniques inspired by mechanisms of natural selection and genetics. In this work, a GA was used to develop a computational tool for the pairwise sequence alignment problem. The program was benchmarked against the bl2seq tool (from the BLAST package) on sequences of varying length and similarity. In some specific cases the GA outperformed bl2seq. This work contributes to the field of bioinformatics by employing a novel methodology that could later serve as an additional option for sequence analysis in genome projects and for refining the results of other commonly used tools, such as BLAST.
|
5 |
Metody vícenásobného zarovnávání nukleotidových sekvencí / Methods for multialignment of nucleotide sequences
Trněný, Ondřej, January 2013
To understand the characteristics and purpose of biological sequences correctly, it is crucial to be able to sort and compare them. To meet this need and to extend existing knowledge, numerous methods have been proposed, especially in the field of multiple sequence alignment. Multiple sequence alignment methods may provide valuable information about sequences that failed to show enough similarity in pairwise alignment. Accordingly, several algorithms have been implemented in computer applications that provide a way to analyse huge data sets. One of them, the progressive alignment algorithm, is implemented as part of this thesis.
|
6 |
Molekulární evoluce meiózy u diploidů a tetraploidů druhu Arabidopsis arenosa / Molecular evolution of meiosis in diploids and tetraploids of Arabidopsis arenosa
Holcová, Magdalena, January 2017
Meiosis is functionally conserved across eukaryotes and is thus not expected to vary considerably among species, and even less so among lineages within a species. However, recent studies showed that this is not necessarily the case in Arabidopsis arenosa. Genome scanning identified excess differentiation in meiosis genes between A. arenosa diploids and tetraploids, interpreted as an adaptation of meiosis to the whole-genome duplication in tetraploids. Differentiation was also found between two diploid lineages. I therefore present a population-based analysis of positive selection acting on meiosis proteins across multiple lineages of A. arenosa. I show that meiosis proteins were under positive selection in all diploid lineages, mainly in the Pannonian and South-eastern Carpathian lineages. The evidence for positive selection in diploid lineages suggests differential pathways of meiosis adaptation in the species, probably reflecting the need to adapt to local environments, among other factors to temperature. The highest enrichment of amino acid substitutions (AASs) under positive selection was identified in tetraploids, consistent with previous genome-scan results. As several interacting meiosis proteins were under positive selection in the same A. arenosa lineage, I hypothesize that the close...
|
7 |
Implementace algoritmu pro hledání podobností DNA řetězců v FPGA / Approximate String Matching Algorithm Implementation in FPGA
Pařenica, Martin, January 2007
This paper describes algorithms for aligning nucleotide sequences. It covers pairwise alignment algorithms based on database search or dynamic programming, followed by dynamic programming for multiple sequences and an algorithm that builds phylogenetic trees. The first part closes with a description of FPGA technology. The second, more practical part describes the implementation of the chosen algorithm and includes examples of several multiple alignments.
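As an illustration of pairwise alignment by dynamic programming, the sketch below computes a global alignment score in the style of Needleman-Wunsch. It is not taken from the thesis: the function name, the linear gap penalty, and the scoring values (`match`, `mismatch`, `gap`) are assumptions chosen for the example.

```python
def needleman_wunsch_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -2) -> int:
    """Return the optimal global alignment score of a and b.

    Illustrative sketch only: linear gap penalty, hypothetical scoring values.
    """
    # prev[j] is the best score for aligning the processed prefix of a with b[:j].
    # Row 0: aligning the empty prefix of a with b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] with the empty prefix of b costs i gaps.
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            # Either pair a[i-1] with b[j-1], or place a gap in one sequence.
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(diag, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[-1]
```

Only two rows of the table are kept, so memory grows with the length of `b` rather than with the product of both lengths. Recovering the alignment itself, rather than just its score, would require keeping the full table or a traceback structure.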
|
8 |
Hardwarová akcelerace algoritmu pro hledání podobnosti dvou DNA řetězců / Hardware Acceleration of Algorithms for Approximate String Matching
Nosek, Ondřej, January 2007
Methods for approximate string matching of various sequences used in bioinformatics are a crucial part of development in this field. These tasks have very high time complexity, and we therefore want to create a hardware platform to accelerate the computations. The goal of this work is to design a generalized architecture, based on FPGA technology, that can work with various types of sequences. The designed acceleration card will use, in particular, dynamic programming algorithms such as Needleman-Wunsch and Smith-Waterman.
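For readers unfamiliar with Smith-Waterman, the following software sketch shows the local-alignment recurrence that such an accelerator would evaluate. It is a plain Python illustration, not the hardware design described in the thesis: the function name, the linear gap penalty, and the scoring values are assumptions.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score between a and b.

    Illustrative sketch only: linear gap penalty, hypothetical scoring values.
    """
    best = 0
    # prev[j] is the best score of a local alignment ending at the previous
    # row of a and at b[j-1]; scores are clamped at zero.
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The 0 option lets a local alignment restart at any cell.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

Each cell depends only on its left, upper, and upper-left neighbours, so all cells on one anti-diagonal can be computed independently. That property is what makes this kind of recurrence a common target for parallel hardware.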
|