71

A tree based algorithm for predicting protein-DNA binding cores.

January 2012
The study of protein-DNA binding between transcription factors (TFs) and transcription factor binding sites (TFBSs) is an important topic in bioinformatics. Currently, high-resolution (length < 10) TF-TFBS binding cores are discovered by expensive and time-consuming 3D structure experiments. Thus, we are motivated to develop a cheap and efficient sequence-based computational method that provides testable novel binding cores with high confidence to accelerate the experiments. Although there are abundant sequence-based motif discovery algorithms, few directly address associating both TF and TFBS core motifs, which are both verifiable on 3D structures. Recent association rule mining approaches on low-resolution binding sequences (TF length > 490) have shown promise in identifying accurate binding cores without using any 3D structures; however, the approach has several drawbacks. In this thesis, the problem of predicting protein-DNA binding cores using association rule mining is formally defined, and a novel tree-based algorithm is developed to overcome the disadvantages of the previous approach. / While the previous association rule mining method for this problem addresses exact sequences only, the most recent ad hoc method for approximation does not establish any formal model and is limited by experimentally known patterns. As biological mutations are common, it is desirable to formally extend the exact model into an approximate one. Thus, we further formalize the problem of mining approximate protein-DNA association rules from sequence data and extend the proposed algorithm to predict approximate protein-DNA binding cores. Experimental results on real data show the performance and applicability of the proposed algorithm in predicting novel TF-TFBS binding cores. Finally, several methods for reducing noise, and thus improving the quality of the mined rules, are proposed and discussed. In particular, statistical tests are effective at removing noise when the minimum support threshold is small. / Detailed summary in vernacular field only. / Wong, Po Yuen. / Thesis (M.Phil.)--Chinese University of Hong Kong, 2012. / Includes bibliographical references (leaves 126-136). / Abstracts also in Chinese.
Abstract
Acknowledgement
Chapter 1 --- Introduction
Chapter 1.1 --- Predicting Protein-DNA Binding Cores
Chapter 1.2 --- Contributions
Chapter 1.3 --- Thesis Outline
Chapter 2 --- Background
Chapter 2.1 --- Biological Background
Chapter 2.1.1 --- The Central Dogma of Molecular Biology
Chapter 2.1.2 --- Transcriptional Regulation
Chapter 2.1.3 --- Experiments on studying TF-TFBS bindings
Chapter 2.2 --- Computational Background
Chapter 2.2.1 --- Motif Discovery
Chapter 2.2.2 --- Association Rule Mining
Chapter 2.2.3 --- Frequent Pattern Mining
Chapter 2.3 --- TF-TFBS Binding Rule Mining in Bioinformatics
Chapter 3 --- Mining TF-TFBS Rules
Chapter 3.1 --- Introduction
Chapter 3.2 --- Problem Definition
Chapter 3.3 --- Frequent Sequence Tree (FS-Tree)
Chapter 3.3.1 --- Semantic of FS-Tree
Chapter 3.3.2 --- Construction of FS-Tree
Chapter 3.4 --- The algorithm
Chapter 3.4.1 --- Correctness
Chapter 3.5 --- Results
Chapter 3.5.1 --- Performance
Chapter 3.5.2 --- Verification using 3D-Structures
Chapter 3.6 --- Discussion and Conclusion
Chapter 3.6.1 --- Parameters Setting
Chapter 3.6.2 --- Deduplication
Chapter 4 --- Extension to Approximate TF-TFBS Rules
Chapter 4.1 --- Introduction
Chapter 4.2 --- Problem Definition
Chapter 4.3 --- Frequent Sequence Class Tree
Chapter 4.4 --- The extended algorithm
Chapter 4.4.1 --- Correctness
Chapter 4.5 --- Results
Chapter 4.5.1 --- Performance
Chapter 4.5.2 --- Verification using PDB
Chapter 4.6 --- Discussion and Conclusion
Chapter 5 --- Noise Reducing Methods
Chapter 5.1 --- Introduction
Chapter 5.2 --- Reducing Noise within a TFBS Group
Chapter 5.2.1 --- Using Exact Count Threshold
Chapter 5.2.2 --- Using Minimum Support
Chapter 5.2.3 --- Using Minimum Approximate Support
Chapter 5.3 --- Reducing Noise using Statistical Test
Chapter 5.3.1 --- A Simple Model
Chapter 5.3.2 --- Statistical Model with Transactions
Chapter 5.4 --- Discussion and Conclusion
Chapter 6 --- Conclusion
Chapter 6.1 --- Conclusion
Chapter 6.2 --- Future Work
Bibliography
Chapter A --- Publications
Chapter A.1 --- Publications
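The abstract of this record frames binding-core prediction as association rule mining over paired TF and TFBS sequences, filtered by a minimum support threshold. The idea of support counting can be sketched as below. This is only a brute-force illustration, not the thesis's tree-based algorithm: the function names (`kmers`, `mine_rules`), the fixed k-mer lengths, and the toy sequences are all assumptions made up for this sketch.

```python
from collections import Counter
from itertools import product


def kmers(seq, k):
    """Return the set of distinct length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def mine_rules(records, k_tf, k_tfbs, min_support):
    """Keep (TF k-mer, TFBS k-mer) pairs that co-occur in >= min_support records.

    Hypothetical helper: support is counted once per record, so a pair
    repeated inside one record does not inflate its count.
    """
    pair_counts = Counter()
    for tf_seq, tfbs_seq in records:
        for pair in product(kmers(tf_seq, k_tf), kmers(tfbs_seq, k_tfbs)):
            pair_counts[pair] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}


# Toy, made-up records: (protein fragment, DNA binding sequence).
records = [
    ("MKRAHT", "TGACGT"),
    ("QKRAHS", "ATGACG"),
    ("MKLLHT", "CCCGGG"),
]
rules = mine_rules(records, k_tf=3, k_tfbs=4, min_support=2)
for (tf_core, tfbs_core), support in sorted(rules.items()):
    print(tf_core, "<->", tfbs_core, "support =", support)
```

Enumerating every k-mer pair grows quickly with sequence length, which is one plausible reason to prefer a tree structure that shares common prefixes; the sketch ignores that concern and also handles exact matches only.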
72

The multi-faceted RNA molecule : Characterization and Function in the regulation of Gene Expression

Ensterö, Mats January 2008
In this thesis I have studied the RNA molecule and its function and characteristics in the regulation of gene expression. I have focused on two events that are important for the regulation of the transcriptome: translational regulation through micro RNAs, and RNA editing through adenosine deamination.
Micro RNAs (miRNAs) are RNA molecules about 22 nucleotides long that bind by semi-complementarity to untranslated regions of a target messenger RNA (mRNA). The interaction manifests through an RNA/protein complex and acts mainly by repressing translation of the target mRNA. I have shown that the two arms of a precursor miRNA hairpin differ significantly in the information content of their sequence composition. I have also shown that sequence composition differs between species.
Selective adenosine-to-inosine (A-to-I) RNA editing is a post-transcriptional process whereby highly specific adenosines in a (pre-)messenger transcript are deaminated to inosines. The deamination is carried out by the ADAR family of proteins and requires a specific sequence and structural landscape for target recognition. Only a handful of messenger substrates have been found to be site-selectively edited in mammals. Still, most of these editing events have an impact on neurotransmission in the brain.
In order to find novel substrates for A-to-I editing, an experimental setup was made to extract RNA targets of the ADAR2 enzyme. In concert with this experimental approach, I have constructed a computational screen to predict specific positions prone to A-to-I editing.
Further, I have analyzed editing in the mouse brain at four different developmental stages by 454 amplicon sequencing. With high resolution, I present data supporting a general developmental regulation of A-to-I editing. I also present data on coupled editing events on single RNA transcripts, suggesting an A-to-I editing mechanism that involves ADAR dimers acting in concert. A different editing pattern is seen for the serotonin receptor 5-ht2c.
74

Machine Learning Approaches to Biological Sequence and Phenotype Data Analysis

Min, Renqiang 17 February 2011
To understand biology at a systems level, this thesis presents novel machine learning algorithms that reveal how genes and their products function at different biological levels. Specifically, at the sequence level, based on kernel support vector machines (SVMs), I propose a learned random-walk kernel and a learned empirical-map kernel to identify protein remote homology solely from sequence data, and I propose a discriminative motif discovery algorithm to identify sequence motifs that characterize protein sequences' remote homology membership. The proposed approaches significantly outperform previous methods, especially on some challenging protein families. At the expression and protein level, using hierarchical Bayesian graphical models, I develop the first high-throughput computational predictive model to filter sequence-based predictions of microRNA targets by incorporating the proteomic data of putative microRNA target genes, and I propose another probabilistic model to explore the underlying mechanisms of microRNA regulation by combining the expression profile data of messenger RNAs and microRNAs. At the cellular level, I further investigate how yeast genes manifest their functions in cell morphology by performing gene function prediction from the morphology data of yeast temperature-sensitive alleles. The resulting prediction models enable biologists to select essential yeast genes of interest and study their predicted novel functions.
76

Probabilistic Graphical Models and Algorithms for

Jiao, Feng January 2008
In this thesis I present research in two fields: machine learning and computational biology. First, I develop new machine learning methods for graphical models that can be applied to protein problems. Then I apply graphical model algorithms to protein problems, obtaining improvements in protein structure prediction and protein structure alignment. In the machine learning work, I focus on a special kind of graphical model: conditional random fields (CRFs). Here, I present a new semi-supervised training procedure for CRFs that can be used to train sequence segmentors and labellers from a combination of labeled and unlabeled training data. Such learning algorithms can be applied to protein and gene name entity recognition problems. This work provides one of the first semi-supervised discriminative training methods for structured classification. In my computational biology work, I focus mainly on protein problems. In particular, I first propose a tree decomposition method for solving the protein structure prediction and protein structure alignment problems. In so doing, I reveal why tree decomposition is a good method for many protein problems. Then, I propose a computational framework for detecting structures similar to a target protein from sparse NMR data, which can help to predict protein structure using experimental data. Finally, I propose a new machine learning approach, LS_Boost, to solve the protein fold recognition problem, which is one of the key steps in protein structure prediction. After a thorough comparison, the algorithm is shown to be both more accurate and more efficient than the traditional z-score method and other machine learning methods.
78

Systems Medicine: An Integrated Approach with Decision Making Perspective

Faryabi, Babak 14 January 2010
Two models are proposed to describe interactions among genes, transcription factors, and signaling cascades involved in regulating a cellular sub-system. These models fall within the class of Markovian regulatory networks and can accommodate different biological time scales. These regulatory networks are used to study pathological cellular dynamics and to discover treatments that beneficially alter those dynamics. The salient translational goal is to design effective therapeutic actions that desirably modify a pathological cellular behavior via external treatments that vary the expression of targeted genes. The objective of therapeutic actions is to reduce the likelihood of the pathological phenotypes related to a disease. The task of finding effective treatments is formulated as a sequential decision-making process that discriminates between gene-expression profiles with high pathological competence and those with low pathological competence. The proposed computational frameworks thereby provide tools that facilitate the discovery of effective drug targets and the design of potent therapeutic actions on them. Each of the system-based therapeutic methods proposed in this dissertation is motivated by practical and analytical considerations. First, it is determined how asynchronous regulatory models can be used as a tool to search for effective therapeutic interventions. Then, a constrained intervention method is introduced to incorporate the side effects of treatments while searching for a sequence of potent therapeutic actions. Lastly, to bypass the impediment of model inference and to mitigate the numerical challenges of exhaustive search algorithms, a heuristic method is proposed for designing system-based therapies. Several case studies illustrate the key ideas of each method.
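The abstract above describes treatment design as sequential decision making on a Markovian regulatory network. One generic way to picture that formulation is a small discounted Markov decision process solved by value iteration, sketched below. This is not the dissertation's model: the two states, the transition probabilities, the costs, and the discount factor are all made-up assumptions chosen only to keep the example readable.

```python
# Hypothetical two-state example; every number here is invented for illustration.
STATES = [0, 1]    # 0 = low pathological competence, 1 = high pathological competence
ACTIONS = [0, 1]   # 0 = no treatment, 1 = treat

# P[action][state][next_state]: assumed transition probabilities.
P = {
    0: [[0.90, 0.10], [0.20, 0.80]],
    1: [[0.95, 0.05], [0.60, 0.40]],
}
TREATMENT_COST = 0.3   # assumed penalty standing in for treatment side effects
GAMMA = 0.9            # assumed discount factor


def cost(state, action):
    """One-step cost: 1 for the pathological state, plus the cost of treating."""
    return (1.0 if state == 1 else 0.0) + (TREATMENT_COST if action == 1 else 0.0)


def q_value(state, action, values):
    """Expected discounted cost of taking `action` in `state`, then following `values`."""
    expected_next = sum(P[action][state][nxt] * values[nxt] for nxt in STATES)
    return cost(state, action) + GAMMA * expected_next


def value_iteration(sweeps=500):
    """Repeat Bellman updates, then read off the cost-minimizing action per state."""
    values = [0.0 for _ in STATES]
    for _ in range(sweeps):
        values = [min(q_value(s, a, values) for a in ACTIONS) for s in STATES]
    policy = [min(ACTIONS, key=lambda a: q_value(s, a, values)) for s in STATES]
    return values, policy


values, policy = value_iteration()
print("expected discounted cost per state:", [round(v, 3) for v in values])
print("policy (0 = no treatment, 1 = treat):", policy)
```

With these invented numbers the computed policy treats only in the pathological state, because the assumed treatment cost outweighs the small preventive benefit in the healthy state. Changing `TREATMENT_COST` shifts that trade-off.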
79

Dynamics and asymptotic behaviors of biochemical networks

Wang, Liming, January 2008
Thesis (Ph. D.)--Rutgers University, 2008. / "Graduate Program in Mathematics." Includes bibliographical references (p. 147-153).
80

Quantitative studies of aging using statistical mechanics and probabilistic approaches

David-Rus, Diana. January 2009
Thesis (Ph. D.)--Rutgers University, 2009. / "Graduate Program in Computational Biology and Molecular Biophysics." Includes bibliographical references.
