  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
71

Probabilistic inference in models of systems biology

Liu, Xin January 2014
In systems biology, it is usual to use a set of ordinary differential equations to characterize biological function at a system level. The parameters in these equations generally reflect the reaction or decay rates of a molecular species, while the states characterize the concentrations of species of interest, e.g. mRNA, proteins and metabolites. Parameter values are often estimated from in vitro experiments, which may not truly reflect the in vivo environment. Some internal states may also be inaccessible to experimental measurement. Hence there is interest in estimating parameter values and states from noisy or incomplete observations taken at the inputs and outputs of a system. This thesis explores several probabilistic inference approaches to do this. The study starts from a thorough investigation of the effectiveness of the most commonly used one-pass inference methods, from which the non-parametric particle filtering approach is shown to be the most powerful method in the sequential category. After this study, the family of Approximate Bayesian Computation (ABC) methods, also known as the likelihood-free batch approach, is reviewed chronologically, and its advantages and deficiencies are summarized via a statistical toy example and two biological models. Additionally, a novel ABC method coupled with sensitivity analysis has been developed and demonstrated on three periodic biological models and one transient biological model. This approach has the potential to solve high-dimensional problems by selectively allocating the computational budget. To assess the capability of the proposed method on real-world problems, we have modeled the polymer pathway and conducted quantitative analysis via the proposed inference approach.
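The likelihood-free idea behind ABC can be illustrated with a minimal rejection sampler. This is only a sketch under stated assumptions, not the method developed in the thesis (which couples ABC with sensitivity analysis): the Gaussian toy simulator, the flat prior on [-5, 5], the sample-mean summary statistic, the tolerance, and all function names are hypothetical choices made for illustration.

```python
import random
import statistics


def simulate(theta, n, rng):
    """Toy simulator (an assumption): n noisy observations with mean theta."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def abc_rejection(observed, n_draws, tolerance, rng):
    """Keep prior draws whose simulated summary is within tolerance of the data.

    No likelihood is evaluated: each candidate parameter is judged only by
    comparing a summary of simulated data with the same summary of the
    observed data.
    """
    obs_summary = statistics.fmean(observed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)  # draw from a flat prior (assumption)
        simulated = simulate(theta, len(observed), rng)
        if abs(statistics.fmean(simulated) - obs_summary) <= tolerance:
            accepted.append(theta)
    return accepted


rng = random.Random(0)
observed = simulate(2.0, 20, rng)  # synthetic "data" generated with theta = 2.0
posterior = abc_rejection(observed, n_draws=5000, tolerance=0.1, rng=rng)
posterior_mean = statistics.fmean(posterior)
```

The accepted draws in `posterior` approximate the posterior over `theta`. A smaller tolerance gives a closer approximation but rejects more draws, which is one reason such schemes become costly in high dimensions.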
72

A systems biology approach to the production of biotechnological products through systematic in silico studies

Oshota, Olusegun James January 2012
Background: Currently, the development of microbial strains for biotechnological production of chemicals and materials can be improved by rational metabolic engineering, which may involve genetic engineering and/or systems biology techniques. Elementary flux mode analysis (EFM) and flux balance analysis (FBA) are the two most commonly used methods for probing the system properties of microbial networks for metabolic engineering purposes. EFM can be used to identify all possible pathways. However, the combinatorial explosion in the number of EFMs obtained during EFM analysis, especially for large reaction networks, hinders the use of EFM data for developing gene knockout strategies. The objective of this project was to identify interesting target products and design 'proof of principle' Saccharomyces cerevisiae strains capable of overproducing a target product; in this case lysine was chosen.
Methods: EFMs were calculated for a reaction network from S. cerevisiae. In order to make sense of the large EFM solution space, a novel approach based on computational reduction and clustering of EFM datasets into subsets was developed, which aided the prediction of knockouts for lysine production. A pattern analysis method, based on regular expression matching, was also developed to interpret the EFM data. FBA frameworks, OptKnock and GDLS, were used to design in silico production strains based on genome-scale models of yeast. Double and triple S. cerevisiae lysine-producing strains were constructed using a PCR-based deletion method. Absolute and relative metabolome measurements for lysine and other metabolites in the single and double mutants were achieved using GC-TOF-MS.
Results: The new computational and clustering methodology significantly aided the EFM-based in silico design of S. cerevisiae strains for enhanced yield of lysine and other value chemicals. Ethanol and lysine overproducing in silico strains were also developed by OptKnock and GDLS. Remarkably, the production strains with single deletions, lsc2 and glt1, excreted into the medium five times as much lysine as the control strain. Five S. cerevisiae double mutant strains were successfully constructed. A two-fold increase in flux towards lysine production was demonstrated by S. cerevisiae double mutant M1, while S. cerevisiae double mutants M4 and M5 both showed about a four-fold increase in lysine production.
Conclusion: The general modelling and data reduction approaches developed here helped to obviate the enormous problems associated with obtaining the EFMs from large reaction network models and interpreting the resulting large number of EFMs. EFM analysis aided the development of single and double S. cerevisiae mutant strains capable of increased yield of lysine. The computational method was validated by the construction of strains that are able to produce several fold more lysine than the original strain.
73

Evolutionary analysis of animal microRNAs

Guerra Martins dos Santos Assunção, José Afonso January 2013
In recent years, microRNAs (miRNAs) have been recognised as important regulators of gene expression in animals and plants. They can potentially target a large fraction of the cellular transcriptome, having been shown to be important for diverse biological processes such as development, cell differentiation, proliferation and metabolism. The publication of the human genome in 2001 marked the start of a great community effort to sequence a variety of other species. These data have great potential for comparative genomics, which can lead to better biological understanding. Some miRNA families are known to be highly conserved across long evolutionary distances, and many are found in co-transcribed clusters across the genome. While these phenomena have been previously reported, a large-scale analysis of evolutionary patterns was still lacking. Furthermore, the rate at which new relevant data is being made available makes it challenging to keep up, and many of the evolutionary studies performed before are now significantly out of date. This thesis describes a number of approaches taken to analyse miRNA datasets, harnessing the full potential of currently available data for comparative genomics. These were used not only to revisit many of the notions in the field with a larger and updated dataset, but also to develop novel strategies that enable a coherent view of miRNA evolution at different evolutionary time-scales. A new tool, described within this thesis, was developed for large-scale, species-independent miRNA mapping. An assessment of the evolution of the miRNA repertoire across species was performed, together with detailed sequence conservation analysis and miRNA family clustering. Phylogenetic profile analysis uncovered interesting co-evolution between miRNAs and protein-coding genes.
The genomic organisation of miRNAs and their conservation across species was also studied, providing detailed conserved synteny maps for miRNAs and proteins across more than 80 species. Finally, at the intra-specific level, I analysed the occurrence of single nucleotide polymorphisms affecting miRNA loci or their predicted target sites. All the tools built and integrated in this research were made available to the community and designed to be easily updated, making it easier to keep up with the data that is constantly being made available. Many aspects of miRNA biology are still being uncovered, and the ability to easily put these findings into an evolutionary context will potentially be useful for the community.
74

RNA sequencing for the study of gene expression regulation

Gonçalves, Ângela January 2012
The process by which information encoded in an organism's DNA is used in the synthesis of functional cell products is known as gene expression. In recent years, sequencing of RNA (RNA-seq) has emerged as the preferred technology for the simultaneous measurement of transcript sequences and their abundance. The analysis of RNA-seq data presents novel challenges, and many methods have been developed for mapping reads to genomic features and quantifying expression. In the first part of my thesis I developed an R-based pipeline for pre-processing, expression estimation and data quality assessment of RNA-seq datasets, which formed the basis for my subsequent work on the evolution of gene expression regulation in mammals. Since changes in gene expression levels are thought to underlie many of the phenotypic differences between species, identifying and characterising the regulatory mechanisms responsible for these changes is an important goal of molecular biology. For this, I studied the regulatory divergence of liver gene expression and of isoform usage between mouse strains. I demonstrate that gene expression diverges extensively between the strains and propose that the regulatory mechanism underlying divergent expression between two closely related mammalian species is a combination of variants that arise in cis and in trans. Isoform usage diverges to a lesser extent and appears to display a larger contribution of trans-acting regulatory elements to its regulation, suggesting that isoform usage may be under different evolutionary constraints. These observations have important implications for understanding mammalian gene expression divergence and for understanding how speciation occurs.
75

Time-resolved analysis of transcription factor induction and cell differentiation

Dvinge, Heidi January 2011
No description available.
76

Prediction of gene expression in embryonic structures of Drosophila melanogaster

Samsonova, Anastasia A. January 2006
No description available.
77

Development of a web application for the optimal identification of peptides and proteins from proteomic analysis data

Αλεξανδρίδου, Αναστασία 08 February 2008
The methods and techniques used to search for protein and peptide sequences in biological databases are presented. The aim of this study is to create a web application that will serve as a freely available bioinformatics tool for identifying peptides and proteins from spectrographic analysis data, regardless of how the original samples were processed.
78

New multi-label correlation-based feature selection methods for multi-label classification and application in bioinformatics

Jungjit, Suwimol January 2016
The very large dimensionality of real-world datasets is a challenging problem for classification algorithms, since often many features are redundant or irrelevant for classification. In addition, a very large number of features leads to a high computational time for classification algorithms. Feature selection methods are used to deal with the large dimensionality of data by selecting a relevant feature subset according to an evaluation criterion. The vast majority of research on feature selection involves conventional single-label classification problems, where each instance is assigned a single class label; but there has been growing research on more complex multi-label classification problems, where each instance can be assigned multiple class labels. This thesis proposes three types of new Multi-Label Correlation-based Feature Selection (ML-CFS) methods, namely: (a) methods based on hill-climbing search, (b) methods that exploit biological knowledge (still using hill-climbing search), and (c) methods based on genetic algorithms as the search method. Firstly, we proposed three versions of ML-CFS methods based on hill-climbing search. In essence, these ML-CFS versions extend the original CFS method by extending the merit function (which evaluates candidate feature subsets) to the multi-label classification scenario, as well as modifying the merit function in other ways. A conventional search strategy, hill-climbing, was used to explore the space of candidate solutions (candidate feature subsets) for those three versions of ML-CFS. These ML-CFS versions are described in detail in Chapter 4. Secondly, in order to try to improve the performance of ML-CFS in cancer-related microarray gene expression datasets, we proposed three versions of the ML-CFS method that exploit biological knowledge.
These ML-CFS versions are also based on hill-climbing search, but the merit function was modified in a way that favours the selection of genes (features) involved in pre-defined cancer-related pathways, as discussed in detail in Chapter 5. Lastly, we proposed two more sophisticated versions of ML-CFS based on Genetic Algorithms (rather than hill-climbing) as the search method. The first version of GA-based ML-CFS is based on a conventional single-objective GA, where there is only one objective to be optimized; while the second version of GA-based ML-CFS performs lexicographic multi-objective optimization, where there are two objectives to be optimized, as discussed in detail in Chapter 6. In this thesis, all proposed ML-CFS methods for multi-label classification problems were evaluated by measuring the predictive accuracies obtained by two well-known multi-label classification algorithms when using the selected features, namely: the Multi-Label K-Nearest Neighbours (ML-kNN) algorithm and the Multi-Label Back Propagation Multi-Label Learning Neural Network (BPMLL) algorithm. In general, the results obtained by the best version of the proposed ML-CFS methods, namely a GA-based ML-CFS method, were competitive with the results of other multi-label feature selection methods and baseline approaches. More precisely, one of our GA-based methods achieved the second best predictive accuracy out of all methods being compared (both with ML-kNN and BPMLL used as classifiers), but there was no statistically significant difference between that GA-based ML-CFS and the best method in terms of predictive accuracy.
In addition, in the experiments with ML-kNN, the most accurate method selects about twice as many features as our GA-based ML-CFS; whilst in the experiments with BPMLL the most accurate method was a baseline method that does not perform any feature selection, and runs the classifier once (with all original features) for each of the many class labels, which is a very computationally expensive baseline approach. In summary, one of the proposed GA-based ML-CFS methods managed to achieve substantial data reduction (selecting a smaller subset of relevant features) without a significant decrease in predictive accuracy with respect to the most accurate method.
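As a rough illustration of a correlation-based merit function driven by hill-climbing search, the sketch below uses the standard single-label CFS merit, k·r_cf / sqrt(k + k(k−1)·r_ff), where r_cf is the mean feature-label correlation and r_ff the mean feature-feature correlation of a k-feature subset. The multi-label extension shown here (averaging the feature-label correlation over all labels), the use of absolute Pearson correlation, the toy data, and all function names are assumptions made for illustration; they are not the ML-CFS definitions from the thesis.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def merit(subset, features, labels):
    """CFS-style merit of a feature subset; higher means relevant and non-redundant."""
    k = len(subset)
    if k == 0:
        return 0.0
    # Mean |correlation| between features and labels, averaged over all
    # labels (this averaging is the illustrative multi-label assumption).
    r_cf = sum(abs(pearson(features[f], lab))
               for f in subset for lab in labels) / (k * len(labels))
    # Mean |correlation| between distinct feature pairs (redundancy penalty).
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = (sum(abs(pearson(features[a], features[b])) for a, b in pairs)
            / len(pairs)) if pairs else 0.0
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def hill_climb(features, labels):
    """Greedy forward selection: add the best feature while the merit improves."""
    selected, best = [], 0.0
    while True:
        candidates = [f for f in range(len(features)) if f not in selected]
        scored = [(merit(selected + [f], features, labels), f) for f in candidates]
        if not scored:
            break
        top_merit, top_feature = max(scored)
        if top_merit <= best:
            break
        selected.append(top_feature)
        best = top_merit
    return selected, best


# Toy data: each feature and label is a column of 8 instances.
features = [
    [0, 0, 0, 0, 1, 1, 1, 1],  # feature 0: predicts label 0
    [0, 0, 0, 0, 1, 1, 1, 1],  # feature 1: redundant copy of feature 0
    [0, 1, 0, 1, 0, 1, 0, 1],  # feature 2: predicts label 1
    [0, 0, 1, 1, 1, 1, 0, 0],  # feature 3: uncorrelated with both labels
]
labels = [
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 1, 0, 1, 0, 1, 0, 1],
]
selected, best = hill_climb(features, labels)
```

On this toy data the search keeps one feature per label and stops before adding either the duplicate or the irrelevant feature, because both would lower the merit.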
79

Machine learning in systems biology at different scales: from molecular biology to ecology

Aderhold, Andrej January 2015
Machine learning has been a source of continuous methodological advances in the field of computational learning from data. Systems biology has profited in various ways from machine learning techniques, in particular from network inference, i.e. the learning of interactions given observed quantities of the involved components or data that stem from interventional experiments. Originally this domain of systems biology was confined to the inference of gene regulation networks, but it has recently expanded to other levels of organization of biological and ecological systems. In particular, the application to species interaction networks in a varying environment is of mounting importance for improving our understanding of the dynamics of species extinctions, invasions, and population behaviour in general. The aim of this thesis is to present an extensive study of various state-of-the-art machine learning techniques applied to a genetic regulation system in plants and to expand and modify some of these methods to infer species interaction networks in an ecological setting. The first study attempts to improve the knowledge about circadian regulation in the plant Arabidopsis thaliana from the viewpoint of machine learning and gives suggestions on what methods are best suited for inference, how the data should be processed and modelled mathematically, and what quality of network learning can be expected by doing so. To achieve this, I generate a rich and realistic synthetic data set that is used for various studies under consideration of different effects and method setups. The best method and setup is applied to real transcriptional data, which leads to a new hypothesis about the circadian clock network structure. The ecological study is focused on the development of two novel inference methods that exploit a common principle from transcriptional time-series, which states that expression profiles over time can be temporally heterogeneous.
A corresponding concept in a two-dimensional spatial domain is that species interaction dynamics can be spatially heterogeneous, i.e. can change in space depending on the environment and other factors. I will demonstrate the expansion from the one-dimensional time domain to the two-dimensional spatial domain, introduce two distinct space segmentation schemes, and consider species dispersion effects with spatial autocorrelation. The two novel methods display a significant improvement in species interaction inference compared to competing methods and display a high confidence in learning the spatial structure of different species neighbourhoods or environments.
80

Novel mathematical and computational approaches for modelling biological systems

Chung, Andy Heung Wing January 2016
This work presents the development, analysis and subsequent simulations of mathematical models aimed at providing a basis for modelling atherosclerosis. This cardiovascular disease is characterized by the growth of plaque in artery walls, forming lesions that protrude into the lumen. The rupture of these lesions contributes greatly to the number of cases of stroke and myocardial infarction, two of the main causes of death in the UK. Any work to understand the processes by which the disease initiates and progresses has the ultimate aim of limiting the disease through either its prevention or medical treatment, and thus contributes a relevant addition to the growing body of research. The literature supports the view that the cause of atherosclerotic lesions is an inflammatory process: succinctly put, excess amounts of certain biochemical species fed into the artery wall via the bloodstream spur the focal accumulation of extraneous cells. Therefore, suitable components of a mathematical model would include descriptions of the interactions of the various biochemical species and their movement in space and time. The models considered here are in the form of partial differential equations. Specifically, the following models are examined: first, a system of reaction-diffusion equations with coupling between surface and bulk species; second, a problem of optimisation to identify an unknown boundary; and finally, a system of advection-reaction-diffusion equations to model the assembly of keratin networks inside cells. These equations are approximated and solved computationally using the finite element method. The methods and algorithms shown aim to provide more accurate and efficient means to obtain solutions to such equations. Each model in this work is extensible, and with elements from each model combined, they have scope to be a platform for a fuller model of atherosclerosis.
