  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
111

Identification and Expression Analysis of Zebrafish Glypicans during Embryonic Development

Brand, Michael, Gupta, Mansi 02 December 2015 (has links) (PDF)
Heparan sulfate proteoglycans (HSPGs) are ubiquitous molecules with indispensable functions in various biological processes. Glypicans are a family of HSPGs, characterized by a GPI anchor which directs them to the cell surface and/or extracellular matrix, where they regulate growth factor signaling during development and disease. We report the identification and expression pattern of glypican genes from zebrafish. The zebrafish genome contains 10 glypican homologs, as opposed to six in mammals, which are highly conserved and are phylogenetically related to the mammalian genes. Some of the fish glypicans, such as Gpc1a, Gpc3, Gpc4, Gpc6a and Gpc6b, show conserved synteny with their mammalian cognate genes. Many glypicans are expressed during the gastrulation stage, but their expression becomes more tissue-specific and defined during somitogenesis stages, particularly in the developing central nervous system. The existence of multiple glypican orthologs in fish with diverse expression patterns suggests highly specialized and/or redundant functions of these genes during embryonic development.
112

Role of mutual information for predicting contact residues in proteins

Gomes, Mireille January 2012 (has links)
Mutual Information (MI) based methods are used to predict contact residues within proteins and between interacting proteins. There have been many high-impact papers citing the successful use of MI for determining contact residues in a particular protein of interest, or in certain types of proteins, such as homotrimers. In this dissertation we have carried out a systematic study to assess whether this popularly employed contact prediction tool is useful on a global scale. After testing original MI and leading MI-based methods on large, cross-species datasets, we found that in general the performance of these methods for predicting contact residues both within (intra-protein) and between proteins (inter-protein) is weak. We observe that all MI variants have a bias towards surface residues, and therefore predict surface residues instead of contact residues. This finding is in contrast to the relatively good performance of i-Patch (Hamer et al. [2010]), a statistical scoring tool for inter-protein contact prediction. i-Patch uses surface residues only as input, groups amino acids by physicochemical properties, and assumes the existence of patches of contact residues on interacting proteins. We examine whether using these ideas would improve the performance of MI. Since inter-protein contact residues are only on the surface of each protein, to disentangle surface from contact prediction we filtered out the confounding buried residues. We observed that considering surface residues only does indeed improve the inter-protein contact prediction ability of all tested MI methods. We examined a specific "successful" case study in the literature and demonstrated that here, even when considering surface residues only, the most accurate MI-based inter-protein contact predictor, MIc, performs no better than random.
We have developed two novel MI variants: the first groups amino acids by their physicochemical properties, and the second considers patches of residues on the interacting proteins. In our analyses these new variants highlight the delicate trade-off between signal and noise that must be achieved when using MI for inter-protein contact prediction. The input for all tested MI methods is a multiple sequence alignment of homologous proteins. In a further attempt to understand why the MI methods perform poorly, we have investigated the influence of gaps in the alignment on intra-protein contact prediction. Our results suggest that, depending on the evaluation criteria and the alignment construction algorithm employed, a gap cutoff of around 10% would maximise the performance of MI methods, whereas the popularly employed 0% gap cutoff may lead to predictions that are no better than random guesses. Based on the insight we have gained through our analyses, we end this dissertation by identifying a number of ways in which the contact residue prediction ability of MI variants may be improved, including direct coupling analysis.
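For readers unfamiliar with the technique, the following is a minimal sketch of plain (uncorrected) mutual information between columns of a multiple sequence alignment, with a per-column gap cutoff applied first. It is not the dissertation's code and does not implement MIc, i-Patch, or the two new variants; the function names, the `-` gap symbol, and the toy alignment are assumptions made for illustration. Only the idea of a gap cutoff of around 10% comes from the abstract.

```python
from collections import Counter
from math import log2


def gap_fraction(col, gap="-"):
    """Fraction of sequences that have a gap in this alignment column."""
    return sum(1 for c in col if c == gap) / len(col)


def mutual_information(col_a, col_b, gap="-"):
    """Plain MI (in bits) between two alignment columns.

    Sequences with a gap in either column are ignored (an assumption;
    other treatments of gaps are possible).
    """
    pairs = [(a, b) for a, b in zip(col_a, col_b) if a != gap and b != gap]
    n = len(pairs)
    if n == 0:
        return 0.0
    count_a = Counter(a for a, _ in pairs)
    count_b = Counter(b for _, b in pairs)
    count_ab = Counter(pairs)
    mi = 0.0
    for (a, b), c in count_ab.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_a[a] * count_b[b]))
    return mi


def score_pairs(alignment, max_gap_fraction=0.10):
    """MI for every pair of columns whose gap fraction is within the cutoff."""
    length = len(alignment[0])
    cols = [[seq[i] for seq in alignment] for i in range(length)]
    keep = [i for i in range(length) if gap_fraction(cols[i]) <= max_gap_fraction]
    scores = {}
    for x, i in enumerate(keep):
        for j in keep[x + 1:]:
            scores[(i, j)] = mutual_information(cols[i], cols[j])
    return scores


# Hypothetical toy alignment: columns 0 and 1 co-vary perfectly,
# column 2 is half gaps and is removed by the 10% cutoff.
toy_alignment = ["AC-", "AC-", "GTA", "GTA"]
toy_scores = score_pairs(toy_alignment)
```

In a contact-prediction setting, the highest-scoring column pairs would be taken as candidate contacts; the abstract's finding is that such raw scores are biased towards surface residues, so a sketch like this should be read as the baseline being criticised rather than a recommended predictor.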
113

Alinhamento de seqüências biológicas em arquiteturas com memória distribuída

Peranconi, Daniela Saccol 04 March 2005 (has links)
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / The use of computer clusters to solve problems that demand large amounts of computational resources is proving to be an attractive alternative. Clusters are economically feasible and easy to maintain, and offer computational power equivalent to that of supercomputers. However, developing applications for this kind of architecture is complex, because it involves issues that are absent from sequential programming, such as data communication and the synchronization of concurrent tasks; on supercomputers these problems are usually handled by specialized software packages. In this context, this work presents the development of a mechanism that supports communication on computer clusters, focused on exploiting this hardware platform for high-performance processing.
The mechanism, created and made available as a library of C functions, is based on the Active Messages model. Its implementation was performed at the application level, using lightweight multiprogramming techniques as programming resou
114

Graphical representation of biological sequences and its applications. / CUHK electronic theses & dissertations collection / Digital dissertation consortium

January 2010 (has links)
Among existing alignment-free methods for comparing biological sequences, graphical representation provides a simple way to view, sort, and compare gene structures. The aim of graphical representation is to display DNA or protein sequences graphically so that their similarities and differences can easily be seen. Visual comparison alone, however, is not enough for follow-up research, which needs a more accurate comparison. This leads us to develop applications of graphical representation for biological sequences. / In this thesis, we make two main contributions: (1) We construct a protein map with the help of our proposed new graphical representation for protein sequences. Each protein sequence can be represented as a point in this map, and cluster analysis of proteins can be performed by comparing the points. This protein map can be used to mathematically specify the similarity of two proteins and to predict properties of an unknown protein based on its amino acid sequence. (2) We construct a novel genome space with biological geometry, which is a subspace of R^N. In this space each point corresponds to a genome. The natural distance between two points in the genome space reflects the biological distance between the two genomes. Our genome space will provide a powerful new tool for analyzing the classification of genomes and their phylogenetic relationships. / Yu, Chenglong. / Adviser: Luk Hing Sun. / Source: Dissertation Abstracts International, Volume: 72-04, Section: B, page: . / Thesis (Ph.D.)--Chinese University of Hong Kong, 2010. / Includes bibliographical references (leaves 59-64). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Electronic reproduction. Ann Arbor, MI : ProQuest Information and Learning Company, [200-] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstract also in Chinese.
115

Discovery Of Application Workloads From Network File Traces

Yadwadkar, Neeraja 12 1900 (has links) (PDF)
An understanding of the input/output data access patterns of applications is useful in several situations. First, gaining an insight into what applications are doing with their data at a semantic level helps in designing efficient storage systems. Second, it helps to create benchmarks that closely mimic realistic application behavior. Third, it enables autonomic systems, as the information obtained can be used to adapt the system in a closed loop. All these use cases require the ability to extract the application-level semantics of I/O operations. Methods such as modifying application code to associate I/O operations with semantic tags are intrusive. It is well known that network file system traces are an important source of information that can be obtained non-intrusively and analyzed either online or offline. These traces are a sequence of primitive file system operations and their parameters. Simple counting, statistical analysis or deterministic search techniques are inadequate for discovering application-level semantics in the general case, because of the inherent variation and noise in realistic traces. In this paper, we describe a trace analysis methodology based on Profile Hidden Markov Models. We show that the methodology has powerful discriminatory capabilities that enable it to recognize applications based on the patterns in the traces, and to mark out regions in a long trace that encapsulate sets of primitive operations that represent higher-level application actions. It is robust enough to work around discrepancies between training and target traces, such as differences in length and interleaving with other operations. We demonstrate the feasibility of recognizing patterns based on a small sampling of the trace, enabling faster trace analysis. Preliminary experiments show that the method is capable of learning accurate profile models on live traces in an online setting.
We present a detailed evaluation of this methodology in a UNIX environment using NFS traces of selected commonly used applications such as compilations as well as on industrial strength benchmarks such as TPC-C and Postmark, and discuss its capabilities and limitations in the context of the use cases mentioned above.
116

Enhance the understanding of whole-genome evolution by designing, accelerating and parallelizing phylogenetic algorithms

Yin, Zhaoming 22 May 2014 (has links)
The advent of new technology has increased the speed and reduced the cost of sequencing biological data. Making biological sense of this genomic data is a big challenge for both the algorithm design and the high-performance computing communities. There are many open problems in bioinformatics, such as how new functional genes arise, why genes are organized into chromosomes, how species are connected through the evolutionary tree of life, or why arrangements are subject to change. Phylogenetic analyses have become essential to research on the evolutionary tree of life. They can help us track the history of species and the relationships between different genes or genomes through millions of years. One of the fundamentals of phylogenetic construction is the computation of distances between genomes. Because rearrangement events involve much more complicated combinatorial patterns, distance computation is still a hot topic that belongs as much to mathematics as to biology. The problem is especially hard when the two input genomes have unequal gene content (with insertions/deletions and duplications). In this thesis, we discuss our contributions to distance estimation for unequal gene order data. The problem of finding the median of three genomes is the key step in building the most parsimonious phylogenetic trees from genome rearrangement data. For genomes with unequal content, to the best of our knowledge, there is no algorithm that can find the median. In this thesis, we contribute to median computation in two respects. 1) On the algorithm engineering side, we harness the power of streaming graph analytics methods to implement an exact DCJ median algorithm that runs as fast as the heuristic algorithm and can help construct a better phylogenetic tree.
2) On the algorithmic side, we theoretically formulate the problem of finding the median of genomes with unequal gene content, which leads to the design and implementation of an efficient Lin-Kernighan heuristic-based median algorithm. Inferring the phylogeny (evolutionary history) of a given set of species is the ultimate goal once the distance and median models are chosen. For more than a decade, biologists and computer scientists have studied how to infer phylogenies by measuring genome rearrangement events using gene order data. While evolution is not an inherently parsimonious process, maximum parsimony (MP) phylogenetic analysis has been widely applied to phylogeny inference to study the evolutionary patterns of genome rearrangements. There are generally two problems in MP phylogenetic analysis of genome rearrangements: first, given a set of modern genomes, how to compute the topology of the corresponding phylogenetic tree; second, given the topology of a model tree, how to infer the gene orders of the ancestral species. Assembling an MP phylogenetic tree constructor involves multiple NP-hard problems which, unfortunately, are layered one on top of another. This means that to solve one NP-hard problem, we need to solve multiple NP-hard sub-problems. For phylogenetic tree construction from genomes with unequal content, there are three layers of NP-hard problems. In this thesis, we mainly discuss our contributions to the design and implementation of the software package DCJUC (Phylogeny Inference using DCJ model to cope with Unequal Content Genomes), which can help achieve both of these goals. Aside from the biological problems, another issue of concern is how to use the power of parallel computing to accelerate algorithms so that they can handle huge data sets, such as high-resolution gene order data.
On the one hand, all of the methods for tackling phylogenetic problems are based on branch-and-bound algorithms, which are quite irregular and unfriendly to parallel computing. To parallelize these algorithms, we need to improve the efficiency of localized memory access and load balancing so that each thread can reach its full potential. On the other hand, there is a revolution taking place in computing with the availability of commodity graphics processors such as Nvidia GPUs and many-core CPUs such as the Cray-XMT or the Intel Xeon Phi coprocessor with 60 cores. These architectures provide a new way to achieve high performance at much lower cost. However, these machines are not easy to program, and scientific computing code is hard to tune well on them. We explore the potential of these architectures to accelerate branch-and-bound-based phylogenetic algorithms.
117

Tracking the evolution of function in diverse enzyme superfamilies

Alderson, Rosanna Grace January 2016 (has links)
Tracking the evolution of function in enzyme superfamilies is key to understanding how important biological functions and mechanisms have evolved. New genes are being sequenced at a rate that far surpasses the capacity of wet-lab techniques to characterize them. Moreover, bioinformatics allows for the use of methods not amenable to wet-lab experimentation. We now face a situation in which we are aware of the existence of many gene families but are ignorant of what they do and how they function. Even for families with many structurally and functionally characterized members, the prediction of function of ancestral sequences can be used to elucidate past patterns of evolution and highlight likely future trajectories. In this thesis, we apply in silico structure and function methods to predict the functions of protein sequences from two diverse superfamily case studies. In the first, the metallo-β-lactamase superfamily, many members have been structurally and functionally characterised. In this work, we asked how many times the same function has independently evolved in the same superfamily, using ancestral sequence reconstruction, homology modelling and alignment to catalytic templates. We found that in only 5% of the evolutionary scenarios assessed was there evidence of a lactam-hydrolysing ancestor. This could be taken as strong evidence that metallo-β-lactamase function has evolved independently on multiple occasions. This finding has important implications for predicting the evolution of antibiotic resistance in this protein fold. However, as discussed, the interpretation of this statistic is not clear-cut. In the second case study, we analysed protein sequences of the DUF-62 superfamily. In contrast to the metallo-β-lactamase superfamily, very few members of this superfamily have been structurally and functionally characterised.
We used the analysis of alignment, gene context, species tree reconciliation and comparison of the rates of evolution to ask if other functions or cellular roles might exist in this family other than the ones already established. We find that multiple lines of evidence present a compelling case for the evolution of different functions within the Archaea, and propose possible cellular interactions and roles for members of this enzyme family.
118

Uma abordagem para linha de produtos de software científico baseada em ontologia e workflow

Costa, Gabriella Castro Barbosa 27 February 2013 (has links)
CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / One way to improve the reusability and maintainability of a family of software products is through the Software Product Line (SPL) approach. In some situations, such as scientific applications for a given area, it is advantageous to develop a collection of related software products using an SPL approach. Scientific Software Product Lines (SSPL) differ from Software Product Lines in that an SSPL uses an abstract scientific workflow model. This workflow is defined according to the scientific domain and, using this abstract workflow model, the products are instantiated. Given the difficulty of specifying scientific experiments, and the need to compose scientific applications to implement them, more appropriate semantic support is needed for the domain analysis phase. Therefore, this work proposes an approach based on the combination of feature models and ontologies, named PL-Science, to support the specification and conduct of scientific experiments. The PL-Science approach, which considers the context of SSPL, aims to assist scientists through a workflow that encompasses the scientific applications of a given experiment. Using SPL concepts, scientists can reuse models that specify the scientific product line and make decisions according to their needs.
This work also focuses on the use of ontologies to facilitate the process of applying Software Product Lines to scientific domains. Through the use of an ontology as a domain model, we can provide additional information as well as add more semantics to the context of Scientific Software Product Lines.
119

Identification de la source de défaut dans une ligne de production du semiconducteur

Chakaroun, Mohamad 29 June 2015 (has links)
A High-Mix Low-Volume manufacturing process is characterized by a wide variety of technologies, low production volumes, and products with short life cycles. Introducing dynamic sampling into this production system has yielded a significant gain in production yield. Dynamic sampling is based, in real time, on the equipment states and on the set of products being manufactured.
Classical yield analysis methods, which require a large number of measurements per product, are no longer as effective. In order to adapt diagnosis to the new manufacturing environment, this thesis proposes a diagnosis approach that locates the equipment at the origin of defects in a semiconductor manufacturing line. It is composed of three main modules. The first module is a method for identifying equipment operating in an abnormal mode, based on commonality analysis. The second module sorts the data: a sequence alignment algorithm is used to compare the characteristics of the samples and to calculate their similarity rate. The third module is reactive sampling for diagnosis, based on a linear optimization model that finds the balance between the number of samples and the analysis time. The proposed approach has been validated on experimental data from the STMicroelectronics manufacturing line in Rousset, France.
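The abstract does not say which alignment algorithm or scoring scheme the thesis uses. As a rough illustration of how a sequence alignment can yield a similarity rate between two samples, here is a sketch assuming Needleman-Wunsch global alignment with hypothetical scores (match +1, mismatch -1, gap -1) and a hypothetical encoding of each sample as a list of process-step labels.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score, keeping one DP row at a time."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Alignment score scaled by the longer length and clipped to [0, 1].

    This normalisation is an assumption for illustration only.
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return max(0.0, global_alignment_score(a, b) / longest)


# Hypothetical samples described by the sequence of steps they went through.
sample_1 = ["litho", "etch", "implant"]
sample_2 = ["litho", "implant"]
rate = similarity(sample_1, sample_2)
```

Samples with a high similarity rate could then be grouped together before sampling; how the thesis actually uses the rate is not stated in the abstract.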
