  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
21

Learning gene interactions from gene expression data dynamic Bayesian networks

Sigursteinsdottir, Gudrun January 2004 (has links)
Microarray experiments generate vast amounts of data that evidently reflect many aspects of the underlying biological processes. A major challenge in computational biology is to extract, from such data, significant information and knowledge about the complex interplay between genes/proteins. An analytical approach that has recently gained much interest is reverse engineering of genetic networks. This is a very challenging approach, primarily due to the dimensionality of the gene expression data (many genes, few time points) and the potentially low information content of the data. Bayesian networks (BNs) and their extension, dynamic Bayesian networks (DBNs), are statistical machine learning approaches that have become popular for reverse engineering. In the present study, a DBN learning algorithm was applied to gene expression data produced from experiments that aimed to study the etiology of necrotizing enterocolitis (NEC), a gastrointestinal inflammatory (GI) disease that is the most common GI emergency in neonates. The data sets were particularly challenging for the DBN learning algorithm in that they contain gene expression measurements for relatively few time points, between which the sampling intervals are long. The aim of this study was, therefore, to evaluate the applicability of DBNs when learning genetic networks for the NEC disease, i.e. from the above-mentioned data sets, and to use biological knowledge to assess the hypothesized gene interactions. From the results, it was concluded that the NEC gene expression data sets were not informative enough for effective derivation of genetic networks for the NEC disease with DBNs and Bayesian learning.
22

Deriving Genetic Networks from Gene Expression Data and Prior Knowledge

Lindlöf, Angelica January 2001 (has links)
In this work three different approaches for deriving genetic association networks were tested. The three approaches were Pearson correlation, an algorithm based on the Boolean network approach and prior knowledge. Pearson correlation and the algorithm based on the Boolean network approach derived associations from gene expression data. In the third approach, prior knowledge from a known genetic network of a related organism was used to derive associations for the target organism, by using homolog matching and mapping the known genetic network to the related organism. The results indicate that the Pearson correlation approach gave the best results, but the prior knowledge approach seems to be the one most worth pursuing.
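As a rough illustration of the Pearson correlation approach mentioned in this abstract, the following minimal Python sketch links two genes whenever the absolute correlation of their expression profiles reaches a cutoff. The gene names, the toy profiles, and the 0.9 threshold are assumptions made for illustration and are not taken from the thesis, which may derive associations differently.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def association_network(profiles, threshold=0.9):
    """Return (gene1, gene2, r) for every pair with |r| >= threshold.

    The threshold is a hypothetical cutoff chosen for this sketch.
    """
    edges = []
    for g1, g2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[g1], profiles[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, r))
    return edges


# Hypothetical toy expression profiles (not data from the thesis).
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],  # a scaled copy of geneA
    "geneC": [4.0, 1.0, 3.0, 2.0],  # only weakly related to the others
}
edges = association_network(profiles)
```

With these toy profiles only the geneA/geneB pair passes the cutoff; lowering the threshold would admit weaker associations.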
24

Analysis of Additive Risk Model with High Dimensional Covariates Using Correlation Principal Component Regression

Wang, Guoshen 22 April 2008 (has links)
One problem of interest is to relate genes to survival outcomes of patients for the purpose of building regression models to predict future patients' survival based on their gene expression data. Applying the semiparametric additive risk model of survival analysis, this thesis proposes a new approach to the analysis of gene expression data that focuses on the model's predictive ability. The method modifies correlation principal component regression to handle the censoring problem of survival data. We also employ the time-dependent AUC and RMSEP to assess how well the model predicts survival time. Furthermore, the proposed method is able to identify significant genes that are related to the disease. Finally, the proposed approach is illustrated with a simulated data set, the diffuse large B-cell lymphoma (DLBCL) data set, and a breast cancer data set. The results show that the model fits both of the data sets very well.
25

Analyzing Gene Expression Data in Terms of Gene Sets: Gene Set Enrichment Analysis

Li, Wei 01 December 2009 (has links)
DNA microarray technology simultaneously monitors the expression of thousands of genes and aims to identify genes that are differentially expressed under different conditions. From the statistical point of view, this can be restated as identifying genes strongly associated with the response or covariate of interest. Gene Set Enrichment Analysis (GSEA) is a method that focuses the analysis on the level of functionally related gene sets instead of single genes. It helps biologists interpret DNA microarray data using their prior biological knowledge of the genes in a gene set. GSEA has been shown to efficiently identify gene sets containing known disease-related genes in real experiments. Here we evaluate the statistical power of this method by simulation studies. The results show that the power of GSEA is good enough to identify the gene sets highly associated with the response or covariate of interest.
26

Replacing qpcr non-detects with microarray expression data : An initialized approach towards microarray and qPCR data integration

Sehlstedt, Jonas January 2018 (has links)
Gene expression analysis can be performed by a number of methods. One of the most common is relative qPCR, which assesses the expression of a determined set of genes relative to a reference gene. Analysis methods benefit from a sample set that is as homogeneous as possible, as great variety in original sample disease status, quality, type, or distribution may yield an uneven base expression between replicates. Additionally, normalization of qPCR data will not work if there are missing values in the data. There are methods for handling non-detects (i.e. missing values) in the data, but most of them are only recommended when a single value, or very few values, are missing. By integrating microarray expression data with qPCR data, the data quality could be improved, removing the need to redo an entire experiment when too much data is missing or the sample data is too heterogeneous. In this project, publicly available microarray data, with sample status similar to that of a given qPCR dataset, was downloaded and processed. The qPCR dataset included 51 genes, of which a set of four DLG genes was chosen for in-depth analysis. For handling missing values, mean imputation and insertion of a Cq value of 40 were used, and a novel method was initiated in which microarray data was used to replace missing values. In summary, replacing missing values with microarray data did not show any significant difference from the other two methods in three of the four DLG genes. This project also suggests an initial approach towards testing the possibility of qPCR and microarray data integration.
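The two conventional ways of handling non-detects named in this abstract, mean imputation and inserting a Cq value of 40, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: non-detects are represented as `None`, the function names are hypothetical, and the replicate values are invented. The microarray-based replacement is not shown because the abstract does not describe how it is computed.

```python
def impute_mean(cq_values):
    """Replace non-detects (None) with the mean of the detected Cq values."""
    detected = [v for v in cq_values if v is not None]
    if not detected:
        raise ValueError("no detected values to average")
    mean = sum(detected) / len(detected)
    return [mean if v is None else v for v in cq_values]


def impute_constant(cq_values, fill=40.0):
    """Replace non-detects (None) with a fixed Cq value, 40 by default."""
    return [fill if v is None else v for v in cq_values]


# Hypothetical replicate measurements for one gene; None marks a non-detect.
replicates = [24.0, None, 26.0, None]

mean_filled = impute_mean(replicates)       # [24.0, 25.0, 26.0, 25.0]
cq40_filled = impute_constant(replicates)   # [24.0, 40.0, 26.0, 40.0]
```

The two strategies pull the data in opposite directions: mean imputation treats a non-detect like a typical replicate, while a Cq of 40 treats it as very low expression.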
27

Variance of Difference as Distance Like Measure in Time Series Microarray Data Clustering

Mukhopadhyay, Sayan January 2014 (has links) (PDF)
Our intention is to find similarity among the time series expressions of the genes in microarray experiments. It is hypothesized that at a given time point the concentration of one gene’s mRNA is directly affected by the concentration of other genes’ mRNA, and that this may have biological significance. We define the dissimilarity between two time-series data sets as the variance of the Euclidean distances at each time point. The large number of gene expressions makes calculating the variance of distance at each point computationally expensive in terms of execution time. For this reason, we use an autoregressive model which approximates each nineteen-point gene expression series by a three-point vector. It allows us to find the variance of difference between two data sets without point-to-point matching. Previous analysis of the microarray experiment data found that 62 genes are regulated following EGF (Epidermal Growth Factor) and HRG (Heregulin) treatment of MCF-7 breast cancer cells. We have chosen these suspected cancer-related genes as our reference and investigated which additional genes have similar time point expression profiles. Keeping variance of difference as a measure of distance, we have used several methods for clustering the gene expression data, such as our own maximum clique finding heuristics and hierarchical clustering. The results obtained were validated through a text mining study. New predictions from our study could be a basis for further investigations into the genesis of breast cancer. Overall, 84 new genes were found, of which 57 are related to cancer; among them, 35 are associated with breast cancer.
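The dissimilarity defined in this abstract can be sketched directly in Python. The version below uses explicit point-to-point matching, which is the costly computation the thesis avoids through its autoregressive approximation, so it illustrates the definition only and is not the author's method. Treating each time point as a scalar (so the Euclidean distance is an absolute difference) and using the population variance are assumptions of this sketch; the function name and toy series are hypothetical.

```python
from statistics import pvariance


def variance_of_difference(series_a, series_b):
    """Variance of the pointwise distances between two time series.

    Each time point is a scalar here, so the Euclidean distance at a
    time point reduces to an absolute difference (an assumption).
    """
    if len(series_a) != len(series_b):
        raise ValueError("series must have the same number of time points")
    distances = [abs(a - b) for a, b in zip(series_a, series_b)]
    return pvariance(distances)


# Hypothetical toy profiles (not data from the thesis).
reference = [1.0, 2.0, 3.0, 4.0]
shifted = [3.0, 4.0, 5.0, 6.0]     # same shape, constant offset
irregular = [1.0, 4.0, 2.0, 6.0]   # different shape

d_shifted = variance_of_difference(reference, shifted)      # 0.0
d_irregular = variance_of_difference(reference, irregular)  # 0.6875
```

A constant offset between two profiles gives a dissimilarity of zero, so the measure compares the shape of the profiles and ignores their absolute level.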
28

Meta-aprendizagem aplicada à classificação de dados de expressão gênica / Meta-learning applied to gene expression data classification

Bruno Feres de Souza 26 October 2010 (has links)
Among the most common applications involving microarrays, one can highlight the classification of tissue samples, which is essential for the correct identification of the occurrence of cancer and its type. This classification is performed with the aid of machine learning algorithms. Choosing the most suitable algorithm for a given problem is not trivial. In this doctoral thesis, the use of meta-learning was studied as a viable solution. The experimental results confirmed the success of the application using a standard framework for characterizing data and constructing the recommendation. Thereafter, improvements were sought in these two aspects. Initially, a new set of meta-attributes based on cluster validation indices was proposed. Then the kNN method for ranking construction was extended to weight the influence of the nearest neighbors. In the context of meta-regression, the use of SVMs was introduced to estimate the performance of classification algorithms. Decision trees were also employed for recommending algorithms. Due to their inferior performance, an ensemble of trees was employed, which greatly improved the quality of the results.
29

Application of Committee k-NN Classifiers for Gene Expression Profile Classification

Dhawan, Manik January 2008 (has links)
No description available.
30

Reporting and analyzing alternative clustering solutions by employing multi-objective genetic algorithm and conducting experiments on cancer data

Peng, P., Addam, O., Elzohbi, M., Ozyer, S., Elhajj, Ahmad, Gao, S., Liu, Y., Ozyer, T., Kaya, M., Ridley, Mick J., Rokne, J., Alhajj, R. 14 November 2013 (has links)
Clustering is an essential research problem which has received considerable attention in the research community for decades. It is a challenge because there is no unique solution that fits all problems and satisfies all applications. We aim to get the most appropriate clustering solution for a given application domain. In other words, clustering algorithms in general need prior specification of the number of clusters, and this is hard even for domain experts to estimate, especially in a dynamic environment where the data changes and/or becomes available incrementally. In this paper, we describe and analyze the effectiveness of a robust clustering algorithm which integrates a multi-objective genetic algorithm into a framework capable of producing alternative clustering solutions; it is called Multi-objective K-Means Genetic Algorithm (MOKGA). We investigate its application for clustering a variety of datasets, including microarray gene expression data. The reported results are promising. Though we concentrate on gene expression and mostly cancer data, the proposed approach is general enough and works equally well to cluster other datasets, as demonstrated by the two datasets Iris and Ruspini. After running MOKGA, a Pareto-optimal front is obtained, which gives the optimal number of clusters as a solution set. The achieved clustering results are then analyzed and validated under several cluster validity techniques proposed in the literature. As a result, the optimal clusters are ranked for each validity index. We apply majority voting to decide on the most appropriate set of validity indexes applicable to every tested dataset. The proposed clustering approach is tested by conducting experiments using seven well-cited benchmark data sets. The obtained results are compared with those reported in the literature to demonstrate the applicability and effectiveness of the proposed approach.
