  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
21

Computational Methods for Knowledge Integration in the Analysis of Large-scale Biological Networks

January 2012 (has links)
As we move into an era of personalized medicine, understanding how biomolecules interact with one another to form cellular systems is one of the key focus areas of systems biology. Several challenges, such as the dynamic nature of cellular systems, uncertainty due to environmental influences, and the heterogeneity between individual patients, render this a difficult task. In the last decade, several algorithms have been proposed to elucidate cellular systems from data, resulting in numerous data-driven hypotheses. However, because of the large number of variables involved, many of which are unknown or not measurable, such computational approaches often lead to a high proportion of false positives. This makes interpretation of the data-driven hypotheses extremely difficult. Consequently, only a very small proportion of these hypotheses are subjected to further experimental validation, which limits their potential to augment existing biological knowledge. This dissertation develops a framework of computational methods for analyzing such data-driven hypotheses by leveraging existing biological knowledge. Specifically, I show how biological knowledge can be mapped onto these hypotheses and subsequently augmented through novel hypotheses. Biological hypotheses are learnt at three levels of abstraction: individual interactions, functional modules, and relationships between pathways, corresponding to three complementary aspects of biological systems. The computational methods developed in this dissertation are applied to high-throughput cancer data, resulting in novel hypotheses with potentially significant biological impact. / Dissertation/Thesis / Ph.D. Computer Science 2012
22

Redução dimensional de dados de alta dimensão e poucas amostras usando Projection Pursuit / Dimension reduction of datasets with large dimensionalities and few samples using Projection Pursuit

Soledad Espezua Llerena 30 July 2013 (has links)
Reducing the dimension of datasets is an important step in pattern recognition and machine learning. Projection Pursuit (PP) has emerged as a relevant technique for this purpose: it searches for projections of the data onto low-dimensional spaces in which interesting structures are revealed. Despite the relative success of PP in many dimension reduction problems, the literature shows only limited application of it to datasets with many features and few samples, such as those generated in molecular biology. This thesis studies ways to exploit the potential of PP in problems of high dimensionality and few samples, in order to facilitate the subsequent construction of classifiers. The main contributions of this work are: i) Sequential Projection Pursuit Modified (SPPM), a sequential method for searching projection spaces, based on a genetic algorithm (GA) and specialized crossover operators; and ii) Block Sequential Projection Pursuit Modified (Block-SPPM) and Whitened Sequential Projection Pursuit Modified (W-SPPM), two strategies for applying SPPM to problems with more attributes than samples. The first strategy is based on partitioning the attributes, while the latter is based on a pre-compaction of the data followed by a projection search. Experimental evaluations on public gene-expression datasets showed the efficacy of the proposals in improving the accuracy of popular classification algorithms relative to several other dimension reduction methods, covering both feature selection and feature extraction, with W-SPPM offering the best compromise between accuracy and computational cost.
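To make the idea of a projection search concrete, the following is a minimal sketch of one-dimensional projection pursuit. It is not the SPPM method from the thesis: the genetic algorithm is swapped for a plain random search, and the projection index (absolute excess kurtosis) is an assumption chosen for illustration, since the abstract does not name the index used. All function names are hypothetical.

```python
import math
import random


def project(data, direction):
    """Project each sample (a list of floats) onto a direction vector."""
    return [sum(x * w for x, w in zip(row, direction)) for row in data]


def kurtosis_index(values):
    """Absolute excess kurtosis of a 1-D projection (an assumed PP index)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    if var == 0.0:
        return 0.0  # a constant projection shows no structure
    fourth = sum((v - mean) ** 4 for v in values) / n
    return abs(fourth / var ** 2 - 3.0)


def random_unit_vector(dim, rng):
    """Draw a direction uniformly on the unit sphere."""
    vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


def pursue_projection(data, n_trials=500, seed=0):
    """Random search (a stand-in for the thesis's genetic algorithm):
    keep the unit direction whose projection maximizes the index."""
    rng = random.Random(seed)
    dim = len(data[0])
    best_dir, best_score = None, -1.0
    for _ in range(n_trials):
        candidate = random_unit_vector(dim, rng)
        score = kurtosis_index(project(data, candidate))
        if score > best_score:
            best_dir, best_score = candidate, score
    return best_dir, best_score
```

Calling `pursue_projection(data)` on a list of samples returns the best direction found and its index value. With many more attributes than samples, a search like this becomes unreliable, which is the situation the Block-SPPM and W-SPPM strategies are described as addressing.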
23

A strategy for a systematic approach to biomarker discovery validation : a study on lung cancer microarray data set

Dol, Zulkifli January 2015 (has links)
Cancer is a serious threat to human health and is now one of the major causes of death worldwide. However, the complexity of cancer makes the development of new and specific diagnostic tools particularly challenging. A number of different strategies have been developed for biomarker discovery in cancer using microarray data. The problem that typically needs to be addressed is the scale of the data sets; we simply do not have (and are unlikely to obtain) sufficient data for classical machine learning approaches to biomarker discovery to be properly validated. Obtaining a biomarker that is specific to a particular cancer is also very challenging. The initial promise held out for gene microarray work in the development of cancer biomarkers has not yet yielded the hoped-for breakthroughs. This work discusses the construction of a strategy for a systematic approach to biomarker discovery validation using lung cancer gene expression microarray data, based around non-small cell cancer in patients who either stayed disease free after surgery (within a five-year window) or in whom the disease progressed and recurred. To assist validation, we have therefore looked at new methodologies for using existing biological knowledge to support machine learning biomarker discovery techniques. We employ a text mining strategy that uses previously published literature to correlate biological concepts with a given phenotype. Pathway-driven approaches, through the use of Web Services and workflows, enabled the large-scale dataset to be analysed systematically. The results showed that it was possible, at least using this specific data set, to clearly differentiate between progressive-disease and disease-free patients using a set of biomarkers implicated in neuroendocrine signaling. A validation of the identified biomarkers was attempted in three separately published data sets. This analysis showed that, although there was support for some of our findings in one of these data sets, this appeared to be a function of the close similarity in experimental design rather than of the specifics of the analysis method developed.
24

Gene Selection by 1-D Discrete Wavelet Transform for Classifying Cancer Samples Using DNA Microarray Data

Jose, Adarsh 09 June 2009 (has links)
No description available.
25

Approaches to Find the Functionally Related Experiments Based on Enrichment Scores: Infinite Mixture Model Based Cluster Analysis for Gene Expression Data

Li, Qian 18 October 2013 (has links)
No description available.
26

Novel Monte Carlo Approaches to Identify Aberrant Pathways in Cancer

Gu, Jinghua 27 August 2013 (has links)
Recent breakthroughs in high-throughput biotechnology have promoted the integration of multi-platform data to investigate signal transduction pathways within a cell. To model the complicated dynamics and heterogeneity of biological pathways, sophisticated computational models are needed that address the unique properties of both the biological hypothesis and the data. In this dissertation, we propose and develop methods using Markov Chain Monte Carlo (MCMC) techniques to solve complex modeling problems in human cancer research by integrating multi-platform data. We focus on two research topics: 1) identification of transcriptional regulatory networks and 2) uncovering of aberrant intracellular signal transduction pathways. We propose a robust method, called GibbsOS, to identify condition-specific gene regulatory patterns between transcription factors and their target genes. A Gibbs sampler is employed to sample target genes from the marginal function of the outlier sum of the regression t-statistic. Numerical simulation has demonstrated a significant performance improvement of GibbsOS over existing methods in the presence of noise and false-positive connections in binding data. We have applied GibbsOS to breast cancer cell line datasets and identified condition-specific regulatory rewiring in human breast cancer. We also propose a novel method, Gibbs sampler to Infer Signal Transduction (GIST), to detect aberrant pathways that are highly associated with biological phenotypes or clinical information. By converting predefined potential functions into a Gibbs distribution, GIST estimates edge directions by learning the distribution of linear signaling pathway structures. Through the sampling process, the algorithm is able to infer signal transduction directions that are jointly determined by both gene expression and network topology. We demonstrate the advantage of the proposed algorithms on simulation data under different settings of noise level in gene expression and of false-positive connections in the protein-protein interaction (PPI) network. Another major contribution of the dissertation is that we have improved on the traditional perspective on aberrant signal transduction by further investigating the structural linkage of signaling pathways. We develop a method called Structural Organization to Uncover pathway Landscape (SOUL), which emphasizes modularized pathway structures in the reconstructed pathway landscape. GIST and SOUL provide a distinctive angle from which to computationally model alternative pathways and pathway crosstalk. The proposed methods can bring insight to drug discovery research by targeting nodal proteins that oversee multiple signaling pathways, rather than treating individual pathways separately. A complete pathway identification protocol, Infer Modularization of PAthway CrossTalk (IMPACT), is developed to bridge downstream regulatory networks with upstream signaling cascades. We have applied IMPACT to datasets from treated breast cancer patients to investigate how estrogen receptor (ER) signaling pathways are related to drug resistance. The pathway proteins identified from patient datasets are well supported by breast cancer cell line models. We hypothesize from the computational results that the HSP90AA1 protein is an important nodal protein that oversees multiple signaling pathways to drive drug resistance. Cell viability analysis has supported our hypothesis by showing a significant decrease in the viability of endocrine-resistant cells compared with non-resistant cells when 17-AAG (a drug that inhibits HSP90AA1) is applied. We believe that this dissertation not only offers novel computational tools for understanding complicated biological problems but, more importantly, provides a valuable paradigm in which systems biology connects data with hypotheses using computational modeling. The initial success of using microarray datasets to study endocrine resistance in breast cancer has shed light on translating results from high-throughput datasets into biological discoveries in studies of complicated human diseases. As next-generation biotechnology becomes more cost-effective, the power of the proposed methods to untangle complicated aberrant signaling rewiring and pathway crosstalk will finally be unleashed. / Ph. D.
27

Análise de dados de expressão gênica: normalização de microarrays e modelagem de redes regulatórias / Gene expression data analysis: microarrays and regulatory networks modelling

Fujita, André 10 August 2007 (has links)
The analysis of gene expression data from DNA microarray experiments is enabling a better understanding of the dynamics and mechanisms involved in cellular processes at the molecular level. In the cancer field, improving this analysis is crucial to better understand the molecular basis of neoplasias and to identify molecular markers for use in diagnosis and in the design of new anti-tumoral drugs. The main goals of this work were to develop a new method to normalize DNA microarray data and two models to construct gene expression regulatory networks. One model analyzes the dynamic connectivity between genes throughout the cell cycle; the other addresses the dimensionality problem in regulatory networks, in which the number of microarray experiments is smaller than the number of genes. We also developed a toolbox with a user-friendly graphical interface that brings together several established data analysis techniques as well as the new approaches presented in this work.
29

TESTING FOR DIFFERENTIALLY EXPRESSED GENES AND KEY BIOLOGICAL CATEGORIES IN DNA MICROARRAY ANALYSIS

SARTOR, MAUREEN A. January 2007 (has links)
No description available.
30

Probabilistic Graphical Models for Prognosis and Diagnosis of Breast Cancer

KHADEMI, MAHMOUD 04 1900 (has links)
One in nine women is expected to be diagnosed with breast cancer during her life. In 2013, an estimated 23,800 Canadian women will be diagnosed with breast cancer and 5,000 will die of it. Making decisions about the treatment for a patient is difficult, since it depends on various clinical features, genomic factors, and the pathological and cellular classification of a tumor.
In this research, we propose a probabilistic graphical model for prognosis and diagnosis of breast cancer that can help medical doctors make better decisions about the best treatment for a patient. Probabilistic graphical models are suitable for making decisions under uncertainty from big data with missing attributes and noisy evidence.
Using the proposed model, we may enter the results of different tests (e.g. estrogen and progesterone receptor test and HER2/neu test), microarray data, and clinical traits (e.g. woman's age, general health, menopausal status, stage of cancer, and size of the tumor) into the model and answer the following questions. How likely is it that the cancer will spread in the body (distant metastasis)? What is the chance of survival? How likely is it that the cancer comes back (local or regional recurrence)? How promising is a treatment? For example, how likely are metastasis and recurrence for a new patient if a certain treatment (e.g. surgical removal, radiation therapy, hormone therapy, or chemotherapy) is applied? We can also classify various types of breast cancer using this model.
Previous work mostly relied on clinical data. In our opinion, since cancer is a genetic disease, the integration of genomic (microarray) and clinical data can improve the accuracy of the model for prognosis and diagnosis. However, increasing the number of variables may lead to poor results because of the curse of dimensionality and the small sample size problem. The microarray data is high dimensional: it consists of around 25,000 variables per patient. Moreover, structure learning and parameter learning for probabilistic graphical models require a significant amount of computation. The number of possible structures is also super-exponential in the number of variables. For instance, there are more than 10^18 possible structures with just 10 variables.
We address these problems by applying manifold learning and dimensionality reduction techniques to improve the accuracy of the model. Extensive experiments using real-world data sets such as METRIC and NKI show the accuracy of the proposed method for classification and for predicting certain events, such as recurrence and metastasis. / Master of Science (MSc)
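The "more than 10^18 possible structures with just 10 variables" figure can be checked with a short calculation, assuming that "structures" means labeled directed acyclic graphs (the structures of Bayesian networks); the abstract does not say this explicitly. The sketch below uses Robinson's recurrence for the number of labeled DAGs; the function name is hypothetical.

```python
from math import comb


def count_dags(n):
    """Number of labeled directed acyclic graphs on n nodes,
    computed with Robinson's recurrence:
    a(m) = sum_{k=1..m} (-1)^(k+1) * C(m, k) * 2^(k*(m-k)) * a(m-k), a(0) = 1.
    """
    counts = [1]  # a(0): the empty graph
    for m in range(1, n + 1):
        total = 0
        for k in range(1, m + 1):
            sign = 1 if k % 2 == 1 else -1
            total += sign * comb(m, k) * 2 ** (k * (m - k)) * counts[m - k]
        counts.append(total)
    return counts[n]
```

Under that assumption, `count_dags(10)` comes out at roughly 4.2 × 10^18, consistent with the figure quoted in the abstract, and the count grows faster than exponentially as nodes are added, which is why exhaustive structure search is impractical.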
