
Full Bayesian boolean network inference based on Markov chain Monte Carlo algorithms.

在生物信息學中, 基因調控網絡推斷不斷受到人們的重視。各種不同的網絡模型被用來描述基因之間的調控關係, 其中包括布爾網絡, 概率布爾網絡, 貝葉斯網絡等。本文主要是討論基於數據的布爾網絡推斷。現在已經有很多方法來推斷節點是離散變量的網絡結構。比如REVEAL算法,Best Fit Extension 算法是兩種比較受歡迎的推斷網絡結構方法。並且他們在網絡的節點數目不是很多的情況下有很好的表現。然而, 現今很多方法對噪音和模型的不確定性沒有足夠的考慮。這也使得這些方法在實際應用中的表現不是很令人滿意。本文中, 我們用完全貝葉斯的方法去研究概率布爾網絡空間。在給定樣本的情況下, 我們提出了一種新的基於馬爾科夫鏈蒙特卡羅的算法。這種算法使得不同的網絡模型根據他們的後驗概率在整個網絡空間中跳動。為使得網絡模型能更好地在不同模型中轉換,我們把局部小網絡根據他們的可能性分配給他們相應的概率值。這些可能的局部小網絡是在數據前期處理中通過卡方檢驗得到的。和其他同類方法一樣, 雖然我們的方法也同樣面臨著在一個很大的網絡空間中搜索的難題, 但我們的方法能達到一個更高的推斷精度。同時,我們的方法所對應的計算量也是在可接收範圍之內。 / In bioinformatics, gene regulatory network inference is receiving growing attention. Various network models have been used to describe gene regulatory relationships, including deterministic Boolean networks, probabilistic Boolean networks, and Bayesian networks. This dissertation focuses on data-based Boolean network reconstruction. Many methods have been proposed to infer this discrete network structure. For example, the REVEAL algorithm and the Best-Fit Extension method are popular and perform well for networks with a limited number of nodes. However, existing methods do not fully account for the ubiquitous noise across the network or for structural uncertainty, which makes these algorithms unsatisfactory in real applications. In this dissertation, we use a full Bayesian approach to explore the space of probabilistic Boolean networks. To compare the relative fitness of networks to the input data, we design novel Markov chain Monte Carlo algorithms that jump among constrained networks according to the joint posterior probability. To facilitate transdimensional moves, high proposal probabilities are assigned to the more likely subnetwork models, as judged by chi-square tests in the preprocessing step. Although it faces the same difficulty as other methods of searching a huge structure space, our algorithm is expected to reconstruct the Boolean network more accurately and comprehensively at an acceptable computational cost.
/ Detailed summary in vernacular field only. / Han, Shengtong. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2012. / Includes bibliographical references (leaves 94-105). / Abstract also in Chinese. / Chapter 1 --- Introduction --- p.1 / Chapter 2 --- Technical Background --- p.5 / Chapter 2.1 --- Classical Boolean Network --- p.5 / Chapter 2.1.1 --- Definition --- p.5 / Chapter 2.1.2 --- Dynamic Properties --- p.8 / Chapter 2.2 --- Probabilistic Boolean Network --- p.9 / Chapter 2.2.1 --- Definition --- p.9 / Chapter 2.2.2 --- Dynamic Properties --- p.11 / Chapter 3 --- Bayesian Framework for Boolean Network Modeling --- p.12 / Chapter 3.1 --- Introduction --- p.12 / Chapter 3.2 --- Network Modeling --- p.15 / Chapter 3.2.1 --- Subnetwork Modeling --- p.15 / Chapter 3.2.2 --- Full Network Modeling --- p.21 / Chapter 3.2.3 --- Prior & Posterior Distributions --- p.23 / Chapter 4 --- Network Inference-MCMC --- p.29 / Chapter 4.1 --- Introduction --- p.29 / Chapter 4.2 --- Proposal Subnetwork Construction --- p.30 / Chapter 4.3 --- Network Structure Updating --- p.33 / Chapter 4.3.1 --- Individual Network Updating Moves --- p.33 / Chapter 4.3.2 --- Overall Network Updating Procedure --- p.37 / Chapter 4.3.3 --- The Core Metropolis-Hastings Algorithm --- p.37 / Chapter 4.4 --- Convergence Diagnostic --- p.40 / Chapter 4.5 --- Model Selection --- p.41 / Chapter 4.5.1 --- AIC, BIC --- p.42 / Chapter 4.5.2 --- Bayes Factor --- p.42 / Chapter 4.5.3 --- Reversible Jump MCMC --- p.43 / Chapter 4.5.4 --- Bayesian Model Averaging --- p.45 / Chapter 4.6 --- Computational Consideration --- p.46 / Chapter 5 --- Numerical Studies --- p.49 / Chapter 5.1 --- Simulation Studies --- p.49 / Chapter 5.1.1 --- Simulation for Synthetic Network Models with Small Number of Nodes --- p.50 / Chapter 5.1.2 --- Simulation for Synthetic Network Models with Large Number of Nodes --- p.64 / Chapter 5.2 --- Comparison with Other Methods --- p.68 / Chapter 5.2.1 --- Comparison Results --- p.71 /
Chapter 5.2.2 --- Discussion --- p.72 / Chapter 6 --- Real Data Analysis --- p.74 / Chapter 6.1 --- A Real Cell Cycle Network --- p.74 / Chapter 6.2 --- Inference Result --- p.76 / Chapter 6.3 --- Discussion --- p.79 / Chapter 7 --- Summary and Discussion --- p.80 / Bibliography --- p.83 / Chapter A --- Data Pre-processing --- p.83 / Chapter A.1 --- Data Discretization --- p.83 / Chapter B --- Truth Tables for Commonly Used Basic Logic Functions --- p.85 / Chapter C --- All Distribution Tables for Gene Pairs and Gene Triplets --- p.86 / Chapter C.1 --- Distribution Assumptions for Input Gene Pairs --- p.86 / Chapter C.2 --- Distribution Assumptions for Gene Triplets --- p.87 / Chapter D --- Pseudo Code of the Algorithm --- p.91 / Chapter D.1 --- Case 1: In-degree=1 --- p.91 / Chapter D.2 --- Case 2: In-degree=2 --- p.93 / Chapter D.3 --- Case 3: In-degree=0 --- p.93
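
The abstract above describes a Markov chain Monte Carlo search whose proposals favor subnetworks that pass a chi-square screen. As a rough illustration only, and not the thesis's algorithm, the Python sketch below applies that idea to the simplest case of choosing a single regulator (in-degree 1) for one target gene. Everything in it is an assumption made for the example: the function names, the Beta(1,1) prior on each transition probability, the uniform prior over candidate regulators, the use of an independence Metropolis-Hastings sampler, and the simulated three-gene time series.

```python
import math
import random

def contingency(data, parent, target):
    """table[v][w] = number of t with x_parent(t) = v and x_target(t+1) = w."""
    table = [[0, 0], [0, 0]]
    for now, nxt in zip(data, data[1:]):
        table[now[parent]][nxt[target]] += 1
    return table

def chi_square(table):
    """Pearson chi-square statistic of a 2x2 table (0.0 if a margin is empty)."""
    n = sum(map(sum, table))
    if n == 0:
        return 0.0
    stat = 0.0
    for v in range(2):
        for w in range(2):
            expected = sum(table[v]) * (table[0][w] + table[1][w]) / n
            if expected == 0:
                return 0.0
            stat += (table[v][w] - expected) ** 2 / expected
    return stat

def log_marginal(table):
    """Log marginal likelihood, Beta(1,1) prior on P(next = 1) for each row."""
    return sum(
        math.lgamma(a + 1) + math.lgamma(b + 1) - math.lgamma(a + b + 2)
        for a, b in table
    )

def sample_parent(data, target, n_iter=5000, seed=0):
    """Independence Metropolis-Hastings over the single regulator of `target`.

    Proposals are drawn with probability proportional to the chi-square
    statistic (plus a small floor), so likely regulators are proposed often.
    Returns the fraction of iterations spent at each candidate regulator.
    """
    rng = random.Random(seed)
    n_genes = len(data[0])
    tables = [contingency(data, j, target) for j in range(n_genes)]
    weights = [chi_square(t) + 1e-3 for t in tables]
    log_post = [log_marginal(t) for t in tables]  # uniform prior over parents
    current = rng.choices(range(n_genes), weights)[0]
    visits = [0] * n_genes
    for _ in range(n_iter):
        proposal = rng.choices(range(n_genes), weights)[0]
        # Hastings ratio: posterior ratio times reversed proposal ratio.
        log_ratio = (log_post[proposal] - log_post[current]
                     + math.log(weights[current]) - math.log(weights[proposal]))
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            current = proposal
        visits[current] += 1
    return [v / n_iter for v in visits]

def simulate(n_steps=200, noise=0.05, seed=1):
    """Hypothetical data: gene 2 at t+1 is NOT gene 0 at t, flipped with prob. `noise`."""
    rng = random.Random(seed)
    state = [rng.randint(0, 1) for _ in range(3)]
    data = [state]
    for _ in range(n_steps):
        nxt = [rng.randint(0, 1), rng.randint(0, 1), 1 - state[0]]
        if rng.random() < noise:
            nxt[2] = 1 - nxt[2]
        data.append(nxt)
        state = nxt
    return data

data = simulate()
print(sample_parent(data, target=2))  # visit frequencies for regulators 0, 1, 2
```

Because the proposal does not depend on the current state, the acceptance ratio must include the proposal weights of both states; dropping that term would bias the sampler toward whatever the chi-square screen prefers. A full treatment would also have to handle larger in-degrees and moves between models of different dimension, which this sketch does not attempt.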

Identifier oai:union.ndltd.org:cuhk.edu.hk/oai:cuhk-dr:cuhk_328449
Date January 2012
Contributors Han, Shengtong., Chinese University of Hong Kong Graduate School. Division of Statistics.
Source Sets The Chinese University of Hong Kong
Language English, Chinese
Detected Language English
Type Text, bibliography
Format electronic resource, remote, 1 online resource (x, 105 leaves) : ill. (some col.)
Rights Use of this resource is governed by the terms and conditions of the Creative Commons “Attribution-NonCommercial-NoDerivatives 4.0 International” License (http://creativecommons.org/licenses/by-nc-nd/4.0/)
