  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Bayesian approach for two model-selection-related bioinformatics problems. / CUHK electronic theses & dissertations collection

January 2013
Bayesian inference is a powerful framework for inferring the parameters and structure of complicated probabilistic models from data. It is widely applied in many fields and is well suited to bioinformatics problems, which are usually highly complex. This thesis introduces new Bayesian models and computational methods to solve two bioinformatics problems, both related to model selection. / The first problem is repeat pattern recognition. Tandem repeats occur frequently in DNA sequences and are important for studying genome evolution and human disease. This thesis focuses on the case in which an unknown number of tandem repeat segments sharing the same pattern are dispersed along a single sequence. A probabilistic generative model is introduced for the tandem repeats. Markov chain Monte Carlo (MCMC) algorithms are used to explore the posterior distribution and thereby infer both the pattern of the tandem repeats and the locations of the repeat segments. Reversible jump MCMC (RJMCMC) algorithms are then used to address the transdimensional model selection problem raised by the variable number of repeat segments. / The second problem concerns the conformational transitions of biomolecules. The function of a biomolecule is inherently related to its conformations, which can be grouped into a set of metastable (long-lived) states, so conformational transitions play an important role in biological processes. These 3D structural changes are generally obtained from molecular dynamics simulations. Based on the microstate-level conformational transitions from such simulations, a Bayesian approach is developed to cluster the microstates into an unknown number of metastable states, which gives rise to a model selection problem. / Through these two problems, this thesis shows that the Bayesian approach offers advantages for bioinformatics: it accounts for the inherent uncertainty in biological data, handles noisy or missing data, and deals with model selection. / Detailed summary in vernacular field only. / Liang, Tong. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2013. / Includes bibliographical references (leaves 120-130). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstracts also in Chinese.
Abstract
Acknowledgement
Chapter 1 --- Introduction
Chapter 1.1 --- Motivation
Chapter 1.2 --- Statistical Background
Chapter 1.3 --- Tandem Repeats
Chapter 1.4 --- Conformational Space
Chapter 1.5 --- Outlines
Chapter 2 --- Preliminaries
Chapter 2.1 --- Bayesian Inference
Chapter 2.2 --- Markov chain Monte Carlo
Chapter 2.2.1 --- Gibbs sampling
Chapter 2.2.2 --- Metropolis-Hastings algorithm
Chapter 2.2.3 --- Reversible Jump MCMC
Chapter 3 --- Detection of Dispersed Short Tandem Repeats Using Reversible Jump MCMC
Chapter 3.1 --- Background
Chapter 3.2 --- Generative Model
Chapter 3.3 --- Statistical inference
Chapter 3.3.1 --- Likelihood
Chapter 3.3.2 --- Prior Distributions
Chapter 3.3.3 --- Sampling from Posterior Distribution via RJMCMC
Chapter 3.3.4 --- Extra MCMC moves for better mixing
Chapter 3.3.5 --- The complete algorithm
Chapter 3.4 --- Experiments
Chapter 3.4.1 --- Evaluation and comparison of the two RJMCMC versions using synthetic data
Chapter 3.4.2 --- Comparison with existing methods using synthetic data
Chapter 3.4.3 --- Sensitivity to Priors
Chapter 3.4.4 --- Real data experiment
Chapter 3.5 --- Discussion
Chapter 4 --- A Probabilistic Clustering Algorithm for Conformational Changes of Biomolecules
Chapter 4.1 --- Introduction
Chapter 4.1.1 --- Molecular dynamic simulation
Chapter 4.1.2 --- Hierarchical Conformational Space
Chapter 4.1.3 --- Clustering Algorithms
Chapter 4.2 --- Generative Model
Chapter 4.2.1 --- Model 1: Vanilla Model
Chapter 4.2.2 --- Model 2: Zero-Inflated Model
Chapter 4.2.3 --- Model 3: Constrained Model
Chapter 4.2.4 --- Model 4: Constrained and Zero-Inflated Model
Chapter 4.3 --- Statistical Inference for Vanilla Model
Chapter 4.3.1 --- Priors
Chapter 4.3.2 --- Posterior distribution
Chapter 4.3.3 --- Collapsed Gibbs for Vanilla Model with a Fixed Number of Clusters
Chapter 4.3.4 --- Inference on the Number of Clusters
Chapter 4.3.5 --- Synthetic Data Study
Chapter 4.4 --- Statistical Inference for Zero-Inflated Model
Chapter 4.4.1 --- Method 1
Chapter 4.4.2 --- Method 2
Chapter 4.4.3 --- Synthetic Data Study
Chapter 4.5 --- Statistical Inference for Constrained Model
Chapter 4.5.1 --- Priors
Chapter 4.5.2 --- Posterior Distribution
Chapter 4.5.3 --- Collapsed Posterior Distribution
Chapter 4.5.4 --- Updating for Cluster Labels K
Chapter 4.5.5 --- Updating for Constrained Λ from Truncated Distribution
Chapter 4.5.6 --- Updating the Number of Clusters
Chapter 4.5.7 --- Uniform Background Parameters on Λ
Chapter 4.6 --- Real Data Experiments
Chapter 4.7 --- Discussion
Chapter 5 --- Conclusion and Future Work
Chapter A --- Appendix
Chapter A.1 --- Post-processing for indel treatment
Chapter A.2 --- Consistency Score
Chapter A.3 --- A Proof for Collapsed Posterior distribution in Constrained Model in Chapter 4
Chapter A.4 --- Estimated Transition Matrices for Alanine Dipeptide by Chodera et al. (2006)
Bibliography
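The record above names the Metropolis-Hastings algorithm as one of the MCMC building blocks used to explore a posterior distribution. As a rough illustration only, the sketch below runs a random-walk Metropolis-Hastings sampler on a toy problem that is not taken from the thesis: the posterior of a success rate `theta` after observing `successes` out of `trials`, under a uniform prior. The function names, the toy model, the step size, and the burn-in length are all assumptions made for this example; the thesis's tandem-repeat model and its reversible jump moves are considerably more involved.

```python
import math
import random


def log_posterior(theta, successes, trials):
    """Unnormalised log posterior of a Bernoulli rate under a uniform prior (toy model)."""
    if not 0.0 < theta < 1.0:
        return -math.inf  # zero posterior mass outside (0, 1)
    return successes * math.log(theta) + (trials - successes) * math.log(1.0 - theta)


def metropolis_hastings(successes, trials, n_samples=20000, step=0.1, seed=0):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    theta = 0.5
    current = log_posterior(theta, successes, trials)
    samples = []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, successes, trials)
        # Symmetric proposal, so the acceptance probability is
        # min(1, posterior(proposal) / posterior(theta)).
        accept_prob = math.exp(min(0.0, candidate - current))
        if rng.random() < accept_prob:
            theta, current = proposal, candidate
        samples.append(theta)
    return samples


if __name__ == "__main__":
    draws = metropolis_hastings(successes=35, trials=50)
    kept = draws[2000:]  # discard an assumed burn-in period
    print("posterior mean estimate:", sum(kept) / len(kept))
```

For this toy model the exact posterior is Beta(36, 16), so the sample mean can be checked against 36/52. A reversible jump sampler extends the same accept/reject idea with moves that change the number of parameters, which this sketch does not attempt.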
2

Bioinformatics-inspired binary image correlation: application to bio-/medical-images, microarrays, finger-prints and signature classifications

Unknown Date
This thesis assesses the extent of local features in 2D images for the purpose of recognition and classification. The method compares a test image against a template, both in binary format, using a bioinformatics-inspired approach. The deliverables of the thesis are summarized below:
1. Using the Smith-Waterman (SW) local alignment and Needleman-Wunsch (NW) global alignment approaches from bioinformatics, a test 2D image in binary format is compared against a reference image to identify the features that differ locally between the two images.
2. The SW- and NW-based binary comparison adapts the one-dimensional sequence alignment procedure (traditionally used for molecular sequence comparison in bioinformatics) to 2D image matrices.
3. The relevant algorithms are implemented in MATLAB.
4. The test images considered are real-world bio-/medical images, synthetic images, microarrays, biometric fingerprints (thumb impressions), and handwritten signatures.
Based on the results, conclusions are enumerated and inferences are made, with directions for future studies. / by Deepti Pappusetty. / Thesis (M.S.C.S.)--Florida Atlantic University, 2011. / Includes bibliography. / Electronic reproduction. Boca Raton, Fla., 2011. Mode of access: World Wide Web.
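To make the Smith-Waterman idea in this abstract concrete, here is a minimal Python sketch of the local alignment score applied to binary sequences. The abstract does not say how the one-dimensional procedure is extended to 2D image matrices, and the thesis code is in MATLAB, so the row-by-row comparison in `row_scores`, the function names, and the scoring values (`match=2`, `mismatch=-1`, `gap=-1`) are assumptions chosen for illustration rather than the author's method.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between two sequences (linear gap penalty)."""
    rows, cols = len(a) + 1, len(b) + 1
    # h[i][j] holds the best score of a local alignment ending at a[i-1], b[j-1].
    h = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = h[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Clamping at 0 lets an alignment restart anywhere, which is
            # what makes the alignment local rather than global.
            h[i][j] = max(0, diag, h[i - 1][j] + gap, h[i][j - 1] + gap)
            best = max(best, h[i][j])
    return best


def row_scores(test_image, template):
    """Hypothetical 2D extension: align each row of the test image with the matching template row."""
    return [smith_waterman_score(r, t) for r, t in zip(test_image, template)]


if __name__ == "__main__":
    template = [[1, 0, 1, 1], [0, 0, 1, 0]]
    test = [[1, 0, 1, 1], [1, 1, 0, 1]]
    print(row_scores(test, template))
```

A Needleman-Wunsch variant would drop the clamp at 0 and initialise the first row and column with cumulative gap penalties, so that the whole of both sequences must be aligned.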
3

Joint models for longitudinal and survival data

Yang, Lili 11 July 2014
Indiana University-Purdue University Indianapolis (IUPUI) / Epidemiologic and clinical studies routinely collect longitudinal measures of multiple outcomes. These longitudinal outcomes can be used to establish the temporal order of relevant biological processes and their association with the onset of clinical symptoms. In the first part of this thesis, we propose bivariate change point models for two longitudinal outcomes, with a focus on estimating the correlation between the two change points. We adopt a Bayesian approach for parameter estimation and inference. In the second part, we consider the situation in which a time-to-event outcome is collected along with multiple longitudinal biomarkers measured until the occurrence of the event or censoring. Joint models for longitudinal and time-to-event data can be used to estimate the association between the characteristics of the longitudinal measures over time and survival time. We develop a maximum-likelihood method to jointly model multiple longitudinal biomarkers and a time-to-event outcome. In addition, we focus on predicting conditional survival probabilities and evaluating the predictive accuracy of multiple longitudinal biomarkers in the joint modeling framework. We assess the performance of the proposed methods in simulation studies and apply the new methods to data sets from two cohort studies. / National Institutes of Health (NIH) Grants R01 AG019181, R24 MH080827, P30 AG10133, R01 AG09956.
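The conditional survival probability mentioned in this abstract is, in its simplest form, the chance of surviving a further `horizon` time units given survival up to time `t`, that is S(t + horizon) / S(t). The sketch below evaluates that ratio for a Weibull survival function. The Weibull form, the parameter values, and the function names are assumptions made for illustration; in a joint model the survival function would additionally depend on each subject's longitudinal biomarker history, which this sketch leaves out.

```python
import math


def weibull_survival(t, shape, scale):
    """S(t) = exp(-(t / scale) ** shape) for t >= 0 (assumed baseline form)."""
    return math.exp(-((t / scale) ** shape))


def conditional_survival(t, horizon, shape, scale):
    """P(T > t + horizon | T > t) = S(t + horizon) / S(t)."""
    return weibull_survival(t + horizon, shape, scale) / weibull_survival(t, shape, scale)


if __name__ == "__main__":
    # Probability of surviving 3 more years given survival to year 2,
    # under hypothetical Weibull parameters.
    print(conditional_survival(t=2.0, horizon=3.0, shape=1.5, scale=4.0))
```

With `shape=1.0` the Weibull reduces to the exponential distribution, and the conditional probability no longer depends on `t`, which is a convenient sanity check.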
