  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
161

An integrative network approach for the study of human disease

Dickerson, Jonathan January 2010
Research into human disease has classically been 'bottom-up', focussing on individual genes. However, the emergence of Systems Biology has prompted a more holistic 'top-down' approach to decoding life. Less than a decade since the complete draft of the human genome was published, we are increasingly in a position to model the interacting constituents of a cell and thus understand molecular perturbations. Given that biological phenomena are rarely attributable to individual molecules and linear pathways, we must understand the complex dynamic interplay as cellular components interact, combine, overlap and conflict. The integrative approach afforded by Network Biology provides a powerful toolset for making sense of the vast volumes of omics data. In this thesis, I investigate both infectious disease, specifically HIV infection, and heritable disease. HIV, the causative agent of AIDS, represents an extensive perturbation of the host system and hijacks cellular proteins in order to replicate. I first introduce the HIV-interaction data and then characterise HIV's hijack, revealing the ways Network Biology can greatly enhance our understanding of host-pathogen systems and ultimately of the host system itself. I find a significantly greater propensity for HIV to interact with 'key' host proteins that are highly connected and represent critical cellular functions. Unexpectedly, however, I find no association between HIV interaction and either inferred essentiality or genetic disease association. I hypothesise that these observations could be the result of ancestral selection pressure on retroviruses to minimise interactions with phenotypically crucial proteins. Investigating inherited disease, I apply a similar integrative approach to determine the relationships between inherited disease, evolution and function. I find that 'disease' genes are not a homogeneous group and that their emergence has been ongoing throughout the evolution of life, contradicting previous studies. Finally, I consider the consequences of bias in literature-curated interaction datasets. I develop a novel method to identify and correct for ascertainment bias, and demonstrate that failure to do so weakens conclusions. The aim of this thesis has been to explore the ways Network Biology can provide an integrative biological approach to studying infectious and inherited disease. Given that billions of people around the world are susceptible to disease, it is ultimately hoped that a Systems Biology approach to understanding disease will herald new pharmaceutical interventions.
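A minimal sketch of the kind of degree analysis described above, assuming the networkx library; the interactome and the set of virus-targeted proteins are synthetic placeholders, not the thesis data. It compares the mean degree of targeted proteins against an empirical null built from random protein sets:

```python
import random
import networkx as nx

# Illustrative only: a scale-free graph stands in for the host
# protein-protein interaction network, and a random node set stands
# in for the HIV-interacting proteins.
host = nx.barabasi_albert_graph(1000, 3)
targeted = set(random.sample(list(host.nodes), 50))

obs = sum(host.degree(p) for p in targeted) / len(targeted)

# Empirical null: mean degree of equally sized random protein sets.
null = []
for _ in range(1000):
    draw = random.sample(list(host.nodes), len(targeted))
    null.append(sum(host.degree(p) for p in draw) / len(draw))

p_value = sum(m >= obs for m in null) / len(null)
print(f"observed mean degree {obs:.2f}, permutation p = {p_value:.3f}")
```

With real interaction data in place of the placeholders, a small p-value would indicate that the targeted proteins are more highly connected than chance predicts.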
162

Synthesising executable gene regulatory networks in haematopoiesis from single-cell gene expression data

Woodhouse, Steven January 2017
A fundamental challenge in biology is to understand the complex gene regulatory networks which control tissue development in the mammalian embryo, and maintain homoeostasis in the adult. The cell fate decisions underlying these processes are ultimately made at the level of individual cells. Recent experimental advances in biology allow researchers to obtain gene expression profiles at single-cell resolution over thousands of cells at once. These single-cell measurements provide snapshots of the states of the cells that make up a tissue, instead of the population-level averages provided by conventional high-throughput experiments. The aim of this PhD was to investigate the possibility of using this new high-resolution data to reconstruct mechanistic computational models of gene regulatory networks. In this thesis I introduce the idea of viewing single-cell gene expression profiles as states of an asynchronous Boolean network, and frame model inference as the problem of reconstructing a Boolean network from its state space. I then give a scalable algorithm to solve this synthesis problem. In order to achieve scalability, this algorithm works in a modular way, treating different aspects of a graph data structure separately before encoding the search for logical rules as Boolean satisfiability problems to be dispatched to a SAT solver. Together with experimental collaborators, I applied this method to understanding the process of early blood development in the embryo, which is poorly understood due to the small number of cells present at this stage. The emergence of blood from Flk1+ mesoderm was studied by single-cell expression analysis of 3934 cells at four sequential developmental time points. A mechanistic model recapitulating blood development was reconstructed from this data set; it was consistent with known biology and with the bifurcation of blood and endothelium. Several model predictions were validated experimentally, demonstrating that HoxB4 and Sox17 directly regulate the haematopoietic factor Erg, and that Sox7 blocks primitive erythroid development. A general-purpose graphical tool was then developed based on this algorithm, which can be used by biological researchers as new single-cell data sets become available. This tool can deploy computations to the cloud in order to scale up to larger high-throughput data sets. The results in this thesis demonstrate that single-cell analysis of a developing organ, coupled with computational approaches, can reveal the gene regulatory networks that underpin organogenesis. Rapid technological advances in our ability to perform single-cell profiling suggest that my tool will be applicable to other organ systems and may inform the development of improved cellular programming strategies.
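The thesis encodes the search for logical rules as Boolean satisfiability problems; as a rough, brute-force stand-in for that synthesis step, the sketch below enumerates update rules for one gene that are consistent with a toy set of observed asynchronous transitions (the gene names and transitions are invented, and steady-state constraints are ignored for brevity):

```python
from itertools import combinations, product

# Toy data: asynchronous transitions, i.e. exactly one gene flips
# per step.
genes = ["g0", "g1", "g2"]
transitions = [
    ((1, 0, 0), (1, 1, 0)),   # g1 switches on while g0 is on
    ((1, 1, 0), (1, 1, 1)),   # g2 switches on while g1 is on
    ((0, 1, 1), (0, 0, 1)),   # g1 switches off while g0 is off
]

def consistent_rules(target, max_regulators=2):
    """Enumerate (regulators, truth table) pairs for `target` that
    reproduce every observed transition in which that gene flips."""
    idx = genes.index(target)
    relevant = [(s, t[idx]) for s, t in transitions if s[idx] != t[idx]]
    rules = []
    for k in range(1, max_regulators + 1):
        for regs in combinations(range(len(genes)), k):
            for table in product([0, 1], repeat=2 ** k):
                def f(state, regs=regs, table=table):
                    key = sum(state[r] << i for i, r in enumerate(regs))
                    return table[key]
                if all(f(s) == out for s, out in relevant):
                    rules.append((tuple(genes[r] for r in regs), table))
    return rules

# Several rules may fit sparse data, e.g. g1' = g0 is among them;
# a SAT encoding makes this search scale far beyond brute force.
print(consistent_rules("g1"))
```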
163

Phenotyping cellular motion

Zhou, Felix January 2017
In the development of multicellular organisms, tissue development and homeostasis require coordinated cellular motion. For example, in conditions such as wound healing, immune and epithelial cells need to proliferate and migrate. Deregulation of key signalling pathways in pathological conditions causes alterations in the cellular motion properties that are critical for disease development and progression; in cancer, it leads to invasion and metastasis. Consequently, there is strong interest in identifying factors, including drugs, that affect the motion and interactions of cells in disease, using experimental models suitable for high-content screening. There are two main modes of cell migration: individual and collective migration. Analysis tools that jointly consider both modes and provide robust, sensitive and comprehensive motion characterisation across varying experimental conditions and large, extended timelapse acquisitions are currently limited. We have developed a systematic motion analysis framework, Motion Sensing Superpixels (MOSES), to quantitatively capture cellular motion in timelapse microscopy videos in a manner suitable for high-content screening. MOSES builds upon established computer vision approaches to deliver a minimal-parameter, robust algorithm that can i) extract reliable phenomena-relevant motion metrics, ii) discover spatiotemporally salient motion patterns and iii) facilitate unbiased analysis with little prior knowledge through unique motion 'signatures'. The framework was validated by application to numerous datasets including YouTube videos, zebrafish immunosurveillance and Drosophila embryo development. We demonstrate two extended applications: the analysis of interactions between two epithelial populations in 2D culture, using cell lines of the squamous and columnar epithelia from human normal esophagus, Barrett's esophagus and esophageal adenocarcinoma, and the automatic monitoring of 3D organoid culture growth captured through label-free phase contrast microscopy. MOSES found unique boundary formation between squamous and columnar cells and could measure subtle changes in boundary formation due to external stimuli. MOSES automatically segments the motion and shape of multiple organoids even when several are present in the same field of view. Automated analysis of intestinal organoid branching following treatment agrees with independent RNA-seq results.
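A hedged sketch of the core idea behind superpixel-based motion sensing, not the MOSES implementation itself: dense optical flow between consecutive frames is summarised per superpixel, yielding a small set of trackable motion units. It assumes OpenCV and scikit-image are available; the frames are toy images:

```python
import numpy as np
import cv2
from skimage.segmentation import slic

def superpixel_flow(frame0, frame1, n_segments=200):
    """Mean optical-flow vector (dx, dy) per superpixel of frame0."""
    flow = cv2.calcOpticalFlowFarneback(
        frame0, frame1, None,
        pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
    labels = slic(frame0, n_segments=n_segments, channel_axis=None)
    return {lab: flow[labels == lab].mean(axis=0)
            for lab in np.unique(labels)}

# Toy frames: a bright square translated 3 pixels to the right.
f0 = np.zeros((128, 128), np.uint8)
f0[40:80, 40:80] = 255
f1 = np.roll(f0, 3, axis=1)
vectors = superpixel_flow(f0, f1)   # superpixels over the square show dx ~ 3
```

Tracking such per-superpixel vectors over a whole video is what allows motion 'signatures' to be built without segmenting individual cells.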
164

Clustering biological data using a hybrid approach : Composition of clusterings from different features

Keller, Jens January 2008
Clustering of data is a well-researched topic in computer science. Many approaches have been designed for different tasks. In biology, many of these approaches are hierarchical and the result is usually represented in dendrograms, e.g. phylogenetic trees. However, many non-hierarchical clustering algorithms are also well-established in biology. The approach in this thesis is based on such common algorithms. The algorithm implemented as part of this thesis uses a non-hierarchical graph clustering algorithm to compute a hierarchical clustering in a top-down fashion. It performs the graph clustering iteratively, with a previously computed cluster as input set. The innovation is that it focuses on a different feature of the data in each step and clusters the data according to that feature. Common hierarchical approaches in biology cluster, for example, a set of genes according to the similarity of their sequences; the clustering then reflects a partitioning of the genes according to their sequence similarity. The approach introduced in this thesis uses many features of the same objects. These features can be various; in biology, for instance, similarities of the sequences, of gene expression or of motif occurrences in the promoter region. As part of this thesis, not only the algorithm itself was implemented and evaluated, but also a complete software package providing a graphical user interface. The software was implemented as a framework providing the basic functionality, with the algorithm as a plug-in extending the framework. The software is meant to be extended in the future, integrating a set of algorithms and analysis tools related to the process of clustering and analysing data, not necessarily related to biology. The thesis deals with topics in biology, data mining and software engineering and is divided into six chapters. The first chapter gives an introduction to the task and the biological background. It gives an overview of common clustering approaches and explains the differences between them. Chapter two shows the idea behind the new clustering approach and points out differences and similarities between it and common clustering approaches. The third chapter discusses the aspects concerning the software, including the algorithm. It illustrates the architecture and analyses the clustering algorithm. After the implementation the software was evaluated, which is described in the fourth chapter, pointing out observations made during the use of the new algorithm. Furthermore, this chapter discusses differences and similarities to related clustering algorithms and software. The thesis ends with the last two chapters, namely conclusions and suggestions for future work. Readers who are interested in repeating the experiments made as part of this thesis can contact the author via e-mail to get the relevant data for the evaluation, scripts or source code.
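A minimal sketch of the feature-switching idea, with illustrative names only: the same objects are partitioned recursively, using a different feature matrix at each level. The thesis uses a non-hierarchical graph clustering algorithm; k-means stands in for it here:

```python
import numpy as np
from sklearn.cluster import KMeans

def hybrid_cluster(indices, feature_matrices, level=0, k=2):
    """Recursively partition `indices`, clustering on
    feature_matrices[level] at depth `level`; returns nested groups."""
    if level == len(feature_matrices) or len(indices) < k:
        return list(indices)
    X = feature_matrices[level][indices]
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(X)
    return [hybrid_cluster(indices[labels == c], feature_matrices,
                           level + 1, k)
            for c in range(k)]

# Placeholder feature matrices for the same 40 objects.
rng = np.random.default_rng(0)
seq_features   = rng.random((40, 5))   # e.g. sequence similarity
expr_features  = rng.random((40, 8))   # e.g. gene expression
motif_features = rng.random((40, 3))   # e.g. promoter motif counts
tree = hybrid_cluster(np.arange(40),
                      [seq_features, expr_features, motif_features])
```

The nested lists returned correspond to a top-down hierarchy in which each level of the dendrogram reflects a different view of the data.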
165

Comparação algebrica de genomas : o caso da distancia de reversão / Algebraic genome comparison : the case of reversal distance

Almeida, André Atanasio Maranhão, 1981- 23 February 2007
Advisor: João Meidanis / Dissertation (Master's) - Universidade Estadual de Campinas, Instituto de Computação / Abstract (translated from the Portuguese Resumo): In recent decades we have witnessed great advances in molecular biology that have led to the accumulation of a large volume of data on molecules, such as DNA and proteins, that are essential to life and to its understanding. The current stage is the search for tools that allow information of biological relevance to be extracted from these data. In this context, genome comparison emerges as one such tool, and in this category we include genome rearrangements. In rearrangement analysis, a genome is represented by a sequence of conserved blocks and, given two genomes and a set of operations, one searches for a minimum sequence of operations that transforms one genome into the other. In 1995, Hannenhalli and Pevzner presented the first polynomial-time algorithm for the problem of sorting by oriented reversals. That algorithm runs in O(n^4) time and was the first polynomial-time algorithm for a realistic model of genome rearrangement. Since then, algorithms with asymptotically better performance have appeared; the best of them, presented by Tannier and Sagot in 2004, runs in O(n√(n log n)) time. There is also a linear-time algorithm, developed by Bader and colleagues [2], but it only computes the distance and cannot determine the sequence of reversals. Motivated by the lack of a more formal algebraic derivation of the theory developed in genome rearrangements, we developed a formal solution to the signed reversal distance problem. In this solution we use an algebraic formalism for genome rearrangements that relates the recent genome rearrangement theory (essentially founded on the work of Hannenhalli and Pevzner) to permutation group theory in a new way. Through a strong algebraic formalism, we intend to lay the foundation for further advances in the area / Abstract: In the last decades we have seen great progress in molecular biology, which has led to a large volume of data on molecules, DNA and proteins, essential for life. The current stage of research lies in the pursuit of tools to extract information with biological relevance from these data. In this context, comparison of genomes is an important tool, and genome rearrangement is one way of doing that comparison. In rearrangement analysis the genome is represented by a sequence of conserved blocks; given two genomes and a set of allowed operations as input, the aim is to find a minimum sequence of operations that transforms one genome into the other. In 1995, Hannenhalli and Pevzner presented the first polynomial algorithm for sorting signed permutations by reversals. This algorithm has time complexity O(n^4) and was the first polynomial algorithm for a realistic model of genome rearrangement. Since then, new algorithms with better asymptotic performance have appeared; the fastest, with complexity O(n√(n log n)), was developed by Tannier and Sagot in 2004. Motivated by the lack of a more formal derivation of genome rearrangement theory, we developed a formal solution for the signed reversal distance problem. We use an algebraic formalism that relates the recent genome rearrangement theory (essentially based on the work of Hannenhalli and Pevzner) to permutation group theory in a new form. We intend to build a solid theoretical base for further advances in the area through a strong algebraic formalism / Master's / Theory of Computation / Master in Computer Science
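As a small illustration of the basic object studied here, the sketch below applies a signed reversal ρ(i, j) to a permutation of conserved blocks: the segment is reversed and the signs of its elements are flipped. This is only the elementary operation, not the distance algorithm:

```python
def reversal(perm, i, j):
    """Signed reversal rho(i, j): reverse positions i..j (0-based,
    inclusive) and flip the sign of every element in the segment."""
    return perm[:i] + [-x for x in reversed(perm[i:j + 1])] + perm[j + 1:]

genome = [3, -1, 2, -4]
print(reversal(genome, 1, 2))   # [3, -2, 1, -4]
```

The signed reversal distance is then the minimum number of such operations needed to transform one signed permutation into another.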
166

Comparação de genomas completos de especies da familia Vibrionacea empregando rearranjo de genomas / A rearrangement-based approach to compare whole genomes of Vibrionacea strains

Cogo, Patricia Pilisson 23 February 2007
Advisor: João Meidanis / Dissertation (Master's) - Universidade Estadual de Campinas, Instituto de Computação / Abstract (translated from the Portuguese Resumo): The evolution of sequencing techniques has made it possible to obtain an enormous quantity of genomic data. The current challenge is to analyse these data and build new knowledge from them. In this context, an important and still open problem is the creation of methods for the taxonomic analysis of complete genomes. Especially for prokaryotic organisms, for which there is still no clear concept of species, the comparison of complete genomes can make an important contribution. In this work we propose a methodology for comparing complete genomes based on genome rearrangement theory, and we apply it to organisms of the family Vibrionaceae, a heterogeneous family that comprises organisms of five different genera, including the vibrio that causes cholera, a serious disease that still causes thousands of deaths every year in developing countries. The genomic distances obtained when we analyse separately each of the two chromosomes that make up the genome of these organisms are in agreement with phylogenetic trees built using other genome comparison methods. This result corroborates our method and the use of genome rearrangement theory as an alternative for the analysis of complete genomes. It may also indicate that the events modelled in this work, such as gene loss, horizontal transfer and reversals, among others, play an important role in the evolution of these organisms. Understanding the dynamics of these events and combining it with other genome analysis methods could represent a major step towards building a more accurate phylogeny for these vibrios / Abstract: The evolution of genomic sequencing techniques has resulted in a large amount of genomic data. The challenge that arises from this scenario is to analyse these data and to extract relevant biological information from them. In this context, taxonomic analysis of complete genome sequences is still an open problem. Furthermore, it is critical for prokaryotes, which still lack a clear definition of species and whose taxonomic classification is in continuous evolution; here, complete genome comparison may well play a significant role. In this work, we propose a methodology to compare complete genomes based on genome rearrangement theory. We have applied our method to organisms of the Vibrionaceae family, a heterogeneous family that comprises organisms from five different genera, including the agent responsible for cholera, a severe disease in developing countries. The genomic distances obtained when we analysed each chromosome individually are in agreement with phylogenetic trees built using other genomic methods. This result validates our method and genome rearrangement theory as an alternative to analyse complete genomes. It can also indicate the importance played by rearrangement events in vibrio genomic evolution. The understanding of these events, combined with other genomic methods, can play an important role in the construction of a robust vibrio phylogeny / Master's / Computational Biology / Master in Computer Science
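For illustration, the sketch below computes a breakpoint distance, one of the simplest rearrangement-based measures, between two signed gene orders; the thesis relies on richer rearrangement models, and the gene orders here are invented:

```python
def breakpoints(a, b):
    """Count adjacencies of signed gene order `a` not conserved in `b`."""
    adj = {(x, y) for x, y in zip(b, b[1:])}
    adj |= {(-y, -x) for x, y in adj}   # an adjacency read in reverse
    return sum((x, y) not in adj for x, y in zip(a, a[1:]))

# Invented gene orders for one chromosome of two strains.
strain_a = [1, 2, 3, 4, 5]
strain_b = [1, -3, -2, 4, 5]             # strain_a with blocks 2..3 reversed
print(breakpoints(strain_a, strain_b))   # 2
```

A matrix of such pairwise distances, one chromosome at a time, is the kind of input from which the phylogenetic comparisons described above can be built.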
167

Timing of chromosomal alterations during tumour development

Viklund, Björn January 2017
During cancer development, tumour cells accumulate numerous somatic point mutations and copy number alterations, and it is not unusual for affected genes to have a copy number that differs from the usual two. Owing to the loss of DNA repair mechanisms, the cells can mutate independently of each other, which gives rise to different subclones within the tumour. A tumour cell and its daughter cells that gain an advantage in cell division speed over their competing neighbours will eventually make up a large portion of the tumour. All the mutations that the subclone's most recent common ancestor had acquired by the time of the expansion are shared across the subclone. In this project, we have developed a method that uses mutation frequencies from publicly available whole genome sequencing data to quantify the number of competing subclones in a sample and to determine the timing of its copy number duplications. This method could be developed further into an extension of regular copy number analysis. A heterogeneous tumour can grow faster and be more resistant to treatment; it is therefore important to learn more about cancer development and to gain a greater understanding of the order in which copy number alterations occur.
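A hedged sketch of the timing idea (illustrative, not the project's pipeline): in a pure, clonal tumour region of copy number 3, where one parental allele has been duplicated, mutations acquired before the gain lie on two copies (VAF near 2/3) while later ones lie on one copy (VAF near 1/3), so the balance of the two clusters dates the gain on a molecular clock running from 0 (early) to 1 (sampling):

```python
import numpy as np

def time_gain(vafs):
    """Date a single-allele duplication (copy number 3) from variant
    allele frequencies in the gained region (pure, clonal sample)."""
    vafs = np.asarray(vafs)
    n2 = np.sum(np.abs(vafs - 2 / 3) < np.abs(vafs - 1 / 3))  # multiplicity 2
    n1 = len(vafs) - n2                                        # multiplicity 1
    # Before the gain, mutations accrue on 2 copies (one later
    # doubled); afterwards on 3: N2 ~ t and N1 ~ t + 3(1 - t),
    # giving t = 3 * N2 / (N1 + 2 * N2).
    return 3 * n2 / (n1 + 2 * n2)

print(time_gain([0.68, 0.65, 0.35, 0.31, 0.33, 0.30]))  # ~0.75: a late gain
```

Real data would additionally require corrections for tumour purity and subclonality, which is where the subclone quantification described above comes in.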
168

A comparative validation of the human variant simulator SIMdrom

Ånäs, Sofia January 2017
The past decade's progress in next generation sequencing has drastically decreased the price of whole genome and exome sequencing, making it available as a clinical tool for diagnosing patients with genetic disease. However, finding a disease-causing mutation among millions of non-pathogenic variants in a patient's genome is not an easy task. Therefore, algorithms for finding the variants most relevant for clinicians to investigate more closely are needed and constantly being developed. To test these algorithms, a software package called SIMdrom has been developed to simulate test data. In this project, the simulated data is validated through comparison to real genetic data to ensure that it is suitable for use as test data. By ensuring the data's reliability and finding possible improvements, the development of algorithms for finding disease-causing mutations can be facilitated, which in turn could lead to better diagnostic possibilities for clinicians. When simulated data is visualised together with real genomes using principal component analysis, it clusters near its real counterpart, showing that the simulated data resembles the real genomes. Simulated exomes also performed well when used as part of one of three training sets for the classifier in the Prioritization of Exome Data by Image Analysis study, where they performed second best after an in-house data set consisting of real exomes. To conclude, the SIMdrom simulated data performs well in both parts of this project. Additional tests of its validity should include testing against larger real data sets; one possible improvement would be to implement a simulation option for spiking in noise.
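A minimal sketch of the PCA comparison described above, with synthetic placeholder matrices rather than SIMdrom output: real and simulated genotype matrices are projected into a shared principal-component space and the distance between their centroids is checked:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
# Samples x variants matrices of allele counts (0/1/2); the
# "simulated" matrix is a noisy copy standing in for simulator output.
real = rng.integers(0, 3, size=(100, 500)).astype(float)
simulated = real + rng.normal(0, 0.1, real.shape)

pca = PCA(n_components=2).fit(real)
r, s = pca.transform(real), pca.transform(simulated)
shift = np.linalg.norm(r.mean(axis=0) - s.mean(axis=0))
print(f"centroid shift in PC space: {shift:.3f}")   # small if they co-cluster
```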
169

Métagénomique comparative de novo à grande échelle / Large scale de novo comparative metagenomics

Benoit, Gaëtan 29 November 2017
Résumé (translated): Comparative metagenomics is said to be de novo when samples are compared without prior knowledge. Similarity is then estimated by counting the number of similar DNA sequences between the datasets. A metagenomic project typically generates hundreds of datasets, each containing tens of millions of short DNA sequences of 100 to 200 nucleotides (called reads). At the time this thesis began, comparing such a mass of data with the usual methods would have taken years. This thesis presents de novo approaches for computing the similarity between numerous datasets very quickly. The work we propose uses the k-mer (word of size k) as the unit of comparison between metagenomes. The main method developed during this thesis, named Simka, computes many similarity measures by replacing the species counts classically used with counts of large k-mers (k > 21). Simka scales to current metagenomic projects thanks to a new strategy for counting the k-mers of numerous datasets in parallel. Experiments on data from the Human Microbiome Project and Tara Oceans show that the similarities computed by Simka correlate well with similarities based on species or OTU counts. Simka processed these projects (more than 30 billion reads distributed across hundreds of datasets) in a few hours. It is currently the only tool that scales to such a quantity of data while remaining comprehensive in its comparison results. / Abstract: Metagenomics studies the genomic content of a sample extracted from a natural environment. Among available analyses, comparative metagenomics aims at estimating the similarity between two or more environmental samples at the genomic level. The traditional approach compares the samples based on their content in known identified species. However, this method is biased by the incompleteness of reference databases. By contrast, de novo comparative metagenomics does not rely on a priori knowledge. Sample similarity is estimated by counting the number of similar DNA sequences between datasets. A metagenomic project typically generates hundreds of datasets. Each dataset contains tens of millions of short DNA sequences ranging from 100 to 150 base pairs (called reads). In this context, it would have required years to compare such an amount of data with usual methods. This thesis presents novel de novo approaches to quickly compute the similarity between numerous datasets. The main idea underlying our work is to use the k-mer (word of size k) as a comparison unit of the metagenomes. The main method developed during this thesis, called Simka, computes several similarity measures by replacing species counts with k-mer counts (k > 21). Simka scales up to today's metagenomic projects thanks to a new parallel k-mer counting strategy over multiple datasets. Experiments on data from the Human Microbiome Project and Tara Oceans show that the similarities computed by Simka are well correlated with reference-based and OTU-based similarities. Simka processed these projects (more than 30 billion reads distributed in hundreds of datasets) in a few hours. It is currently the only tool able to scale up to such projects, while providing precise and extensive comparison results.
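A miniature, hedged version of the Simka idea (not its parallel implementation): species counts are replaced by k-mer counts, and an abundance-based similarity, here Bray-Curtis, is computed between read sets. The reads and the small k are toy values:

```python
from collections import Counter
from itertools import combinations

def kmer_counts(reads, k):
    """Count every k-mer occurring in a list of read strings."""
    c = Counter()
    for r in reads:
        for i in range(len(r) - k + 1):
            c[r[i:i + k]] += 1
    return c

def bray_curtis_similarity(c1, c2):
    """Abundance-based similarity: 1 means identical k-mer profiles."""
    shared = sum(min(c1[x], c2[x]) for x in c1.keys() & c2.keys())
    return 2 * shared / (sum(c1.values()) + sum(c2.values()))

samples = {"A": ["ACGTACGTACGTACGTACGTACGT"],
           "B": ["ACGTACGTACGTACGTACGTTTTT"]}
counts = {name: kmer_counts(reads, k=8) for name, reads in samples.items()}
for a, b in combinations(counts, 2):
    print(a, b, bray_curtis_similarity(counts[a], counts[b]))
```

The engineering contribution of the thesis lies precisely where this sketch does not scale: counting k-mers over hundreds of datasets and billions of reads in parallel.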
170

Predicting gene expression using artificial neural networks

Lindefelt, Lisa January 2002
Today one of the greatest aims within bioinformatics is to gain a complete understanding of the functionality of genes and the systems behind gene regulation. Regulatory relationships among genes appear to be complex in nature, since transcriptional control is the result of complex networks interpreting a variety of inputs. It is therefore essential to develop analytical tools for detecting complex genetic relationships. This project examines the ability of the data mining technique artificial neural network (ANN) to detect regulatory relationships between genes. As an initial step towards finding regulatory relationships with the help of ANNs, the goal of this project is to train an ANN to predict the expression of an individual gene. The genes predicted are the nuclear receptor PPAR-γ and the insulin receptor. Predictions of each of the two target genes were made using different datasets of gene expression data as input for the ANN. The results for PPAR-γ indicate that it is not possible to predict its expression under the circumstances of this experiment. The results for the insulin receptor indicate that the use of ANNs for predicting the expression of an individual gene cannot be ruled out.
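A sketch of this experimental setup in modern terms (the thesis predates scikit-learn, so this is only a stand-in): a small neural network is trained to predict one target gene's expression from the expression of other genes across samples, using synthetic placeholder data:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))                       # 200 samples x 50 genes
y = X[:, :3].sum(axis=1) + rng.normal(0, 0.5, 200)   # toy target gene

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
net = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
net.fit(X_tr, y_tr)
print(f"held-out R^2: {net.score(X_te, y_te):.2f}")
```

A held-out R^2 near zero would correspond to the PPAR-γ outcome above; a clearly positive one would correspond to the insulin receptor result.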
