311

Large Scale Machine Learning in Biology

Raj, Anil January 2011 (has links)
Rapid technological advances during the last two decades have led to a data-driven revolution in biology, opening up a plethora of opportunities to infer informative patterns that could lead to deeper biological understanding. The large volumes of data provided by such technologies, however, are not analyzable using hypothesis-driven significance tests and other cornerstones of orthodox statistics. We present powerful tools in machine learning and statistical inference for extracting biologically informative patterns and clinically predictive models from these data.

Motivated by an existing graph partitioning framework, we first derive relationships between optimizing the regularized min-cut cost function used in spectral clustering and the relevance information as defined in the Information Bottleneck method. For fast-mixing graphs, we show that the regularized min-cut cost functions introduced by Shi and Malik over a decade ago can be well approximated as the rate of loss of predictive information about the location of random walkers on the graph. For graphs drawn from a generative model designed to describe community structure, the optimal information-theoretic partition and the optimal min-cut partition are shown to be the same with high probability.

Next, we formulate the problem of identifying emerging viral pathogens and characterizing their transmission in terms of learning linear models that can predict the host of a virus from its sequence information. Motivated by an existing framework for representing biological sequence information, we learn sparse, tree-structured models, built from decision rules based on subsequences, to predict viral hosts from protein sequence data using multi-class AdaBoost, a powerful discriminative machine learning algorithm. The predictive motifs robustly selected by the learning algorithm show strong host-specificity and occur in highly conserved regions of the viral proteome.

We then extend this learning algorithm to the problem of predicting disease risk in humans using single nucleotide polymorphisms (SNPs) -- single-base-pair variations -- across the entire genome. While genome-wide association studies usually aim to infer individual SNPs that are strongly associated with disease, we use popular supervised learning algorithms to infer sufficiently complex tree-structured models, built from single-SNP decision rules, that are both highly predictive (for clinical goals) and biologically interpretable (for basic science goals). In addition to achieving high prediction accuracies, the models identify 'hotspots' in the genome that contain putative causal variants for the disease and also suggest combinatorial interactions that are relevant to the disease.

Finally, motivated by the insufficiency of quantifying biological interpretability in terms of model sparsity alone, we propose a hierarchical Bayesian model that infers hidden structured relationships between features while simultaneously regularizing the classification model using the inferred group structure. The appropriate hidden structure maximizes the log-probability of the observed data, thus regularizing the classifier while increasing its predictive accuracy. We conclude by describing extensions of this model applicable to various biological problems, specifically those described in this thesis, and enumerate promising directions for future research.
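To make the first contribution concrete, here is a minimal sketch of the standard Shi-Malik normalized-cut relaxation that the abstract connects to random walks (an illustration of the classical method, not the thesis's information-theoretic derivation; the toy graph is an assumption for demonstration):

```python
import numpy as np

def spectral_bipartition(A):
    """Two-way normalized cut in the style of Shi and Malik.

    The relaxed min-cut solution is the second eigenvector of the
    random-walk Laplacian L_rw = I - D^{-1} A; thresholding its entries
    gives the partition. The random-walk view of this operator is what
    links the cut cost to how quickly a walker's location becomes
    unpredictable.
    """
    d = A.sum(axis=1)
    D_isqrt = np.diag(1.0 / np.sqrt(d))
    # Symmetric normalized Laplacian: similar to L_rw, stabler to decompose.
    L_sym = np.eye(len(A)) - D_isqrt @ A @ D_isqrt
    vals, vecs = np.linalg.eigh(L_sym)      # eigenvalues in ascending order
    fiedler = D_isqrt @ vecs[:, 1]          # corresponding eigenvector of L_rw
    return fiedler >= np.median(fiedler)

# Toy graph: two 4-node cliques joined by a single bridge edge.
A = np.ones((8, 8)) - np.eye(8)
A[:4, 4:] = A[4:, :4] = 0
A[3, 4] = A[4, 3] = 1
print(spectral_bipartition(A))   # True/False blocks recover the two cliques
```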
312

Genome-wide Predictive Simulation on the Effect of Perturbation and the Cause of Phenotypic Variations with a Network Biology Approach

Jang, In Sock January 2012 (has links)
Thanks to modern high-throughput technologies such as microarray-based gene expression profiling, large amounts of molecular profile data have been generated in several disease-related contexts. Despite the fact that these data likely contain systems-level information about disease regulation, revealing the underlying dynamics between genes and the mechanisms of gene regulation in a genome-wide way remains a major challenge. Understanding these mechanisms in genome-wide fashion, and the resulting dynamical behavior, is a key goal of the nascent field of systems biology. One approach to dissecting the logic of the cell is to use reverse-engineering algorithms that infer regulatory interactions from molecular profile data. In this context, information-theoretic approaches have been very successful: for instance, the ARACNe algorithm has been able to infer transcriptional interactions between transcription factors and their target genes; similarly, the MINDy algorithm has identified post-translational modulators of transcription factor activity by multivariate analysis of large gene expression profile datasets. Many methods have been proposed to improve ARACNe, both from a computational-efficiency perspective and in terms of increasing the accuracy of the predicted interactions. Yet the main core of ARACNe, i.e., the data processing inequality (DPI), has remained virtually unaffected, even though modern information theory has extended the DPI theorem to higher-order interactions. First, we introduce an improvement of ARACNe, hARACNe, which recursively applies a higher-order DPI analysis. We show that the new algorithm successfully detects false-positive feed-forward loops involving more than three genes. Second, we extend the MINDy algorithm using co-information as a novel metric, thus replacing the conditional mutual information and significantly improving the algorithm's predictions.

Broadly, the two ultimate goals of systems perturbation studies are to reveal how human diseases are connected with genes and to find the regulatory mechanisms that determine disease cell behavior. These goals remain daunting, however: even the most talented researchers still have to rely on laborious genetic screens, and only very simplified hypotheses about the effects of a given perturbation have been experimentally validated, typically analyzed against very limited regulatory sub-networks such as individual pathways. To overcome these limitations, this thesis explores the use of gene regulatory networks. Specifically, we propose a new algorithm that can accurately predict cell state in genome-wide fashion following perturbation of individual genes, such as in silencing or ectopic-expression experiments. Experimentally validated methods to predict genome-wide changes in a cellular system following a genetic perturbation are still unavailable, and even when phenotypic variations are experimentally profiled and gene signatures are selected by statistical testing, finding the exact regulator that systematically causes significant variations in a gene signature remains quite challenging. In this research, I introduce and experimentally validate a probabilistic Bayesian method to simulate the propagation of genetic perturbations on integrated gene regulatory networks inferred by the hARACNe and coMINDy algorithms from human B cell data.

With the same predictive framework, we also computationally predict the master driver (regulator) that is most likely to have produced the observed variations in gene expression levels; these studies serve as a systematized pre-screening process before genetic manipulation. I predict in silico the effects of silencing several genes, as well as the causes of phenotypic variations. Performance analysis, tested by Gene Set Enrichment Analysis (GSEA), shows that the new methods are highly predictive, thus providing an initial step toward building predictive probabilistic regulatory models, which may be applicable as pre-screening steps in perturbation studies.
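For readers unfamiliar with the DPI step at ARACNe's core, here is a minimal sketch of the classic first-order (three-gene) pruning rule; hARACNe, per the abstract, recursively extends this to higher-order interactions. The mutual-information matrix is a hypothetical example:

```python
import numpy as np
from itertools import combinations

def dpi_prune(mi, tol=0.0):
    """Classic ARACNe data-processing-inequality step: in every fully
    connected gene triplet, the weakest mutual-information edge is
    assumed to be an indirect interaction and removed. (Sketch only;
    the published algorithm scores all triangles against the original
    MI network with a bootstrap/tolerance procedure.)
    """
    keep = mi > 0
    n = mi.shape[0]
    for i, j, k in combinations(range(n), 3):
        if keep[i, j] and keep[j, k] and keep[i, k]:
            edges = {(i, j): mi[i, j], (j, k): mi[j, k], (i, k): mi[i, k]}
            (a, b), w = min(edges.items(), key=lambda e: e[1])
            # drop the weakest edge if it is dominated by the other two
            if w < (1 - tol) * min(v for e, v in edges.items() if e != (a, b)):
                keep[a, b] = keep[b, a] = False
    return keep

# Hypothetical symmetric MI matrix for a 3-gene chain X -> Y -> Z:
# the indirect X-Z edge should be pruned.
mi = np.array([[0.0, 0.8, 0.3],
               [0.8, 0.0, 0.7],
               [0.3, 0.7, 0.0]])
print(dpi_prune(mi))
```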
313

Topics in Genomic Signal Processing

Jajamovich, Guido Hugo January 2012 (has links)
Genomic information is digital in its nature and admits mathematical modeling in order to gain biological knowledge. This dissertation focuses on the development and application of detection and estimation theories for solving problems in genomics: describing biological problems in mathematical terms and proposing solutions in that domain. More specifically, a novel framework for hypothesis testing is presented for settings where one must decide among multiple hypotheses and where each hypothesis involves unknown parameters. Within this framework, a test is developed that performs detection and estimation jointly in an optimal sense. The proposed test is then applied to the problem of detecting and estimating periodicities in DNA sequences. Moreover, the problem of motif discovery in DNA sequences is presented, where a set of sequences is observed and one must determine which sequences contain instances (if any) of an unknown motif and estimate their positions. A statistical description of the problem is used, and a sequential Monte Carlo method is applied for the inference. Finally, the phasing of haplotypes for diploid organisms is introduced, for which a novel mathematical model is proposed. The haplotypes that are used to reconstruct the observed genotypes of a group of unrelated individuals are detected, and the haplotype pair for each individual in the group is estimated. The model translates a biological principle, the maximum parsimony principle, into a sparseness condition.
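As a flavor of the periodicity-detection application, here is a sketch of the classic period-3 DFT statistic from genomic signal processing (protein-coding regions show excess spectral power at period 3). This is only the underlying statistic; the thesis's contribution is a joint detection/estimation test built around quantities of this kind:

```python
import numpy as np

def period3_power(seq):
    """Normalized spectral power of a DNA sequence at period 3: each base
    is mapped to a binary indicator sequence, and the squared DFT
    magnitudes at frequency N/3 are summed over the four bases.
    """
    N = len(seq)
    total = 0.0
    for base in "ACGT":
        x = np.array([1.0 if c == base else 0.0 for c in seq.upper()])
        total += abs(np.fft.fft(x)[N // 3]) ** 2
    return total / N**2

rng = np.random.default_rng(0)
coding_like = "ATG" * 100                                  # strong period 3
random_like = "".join(rng.choice(list("ACGT"), size=300))  # no periodicity
print(period3_power(coding_like), period3_power(random_like))
```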
314

Integrating Functional Genomics with Systems Biology to Discover Drivers and Therapeutic Targets of Human Malignancies

Yu, Jiyang January 2012 (has links)
Genome-wide RNAi screening has emerged as a powerful tool for loss-of-function studies that may lead to therapeutic target discovery for human malignancies in the era of personalized medicine. However, due to high false-positive and false-negative rates arising from the noise of high-throughput measurements and from off-target effects, powerful computational tools and additional knowledge are much needed to analyze and complement such screens. The availability of high-throughput genomic data, including gene expression profiles and copy number variations from large cohorts of primary patients and cell lines, allows us to identify underlying drivers causally associated with tumorigenesis or drug resistance. In my dissertation, I have developed a framework that integrates functional RNAi screens with systems biology of cancer genomics to tailor potential therapeutics for reversal of drug resistance or treatment of aggressive tumors. I developed a series of algorithms and tools to deconvolute, quality-control, and post-analyze high-throughput shRNA screening data generated by next-generation sequencing (shSeq), in particular a novel Bayesian hierarchical modeling approach to integrate multiple shRNAs targeting the same gene, which outperforms existing methods. In parallel, I developed a systems biology algorithm, NetBID2, to infer disease drivers from high-throughput genomic data by reverse-engineering networks and Bayesian inference; it is able to detect hidden drivers that traditional methods fail to find. Integrating NetBID2 with functional RNAi screens, I have identified known and novel driver-type therapeutic targets in various disease contexts. For example, I discovered that AKT1 is a driver of glucocorticoid (GC) resistance, a major problem in the treatment of T-cell acute lymphoblastic leukemia (T-ALL); inhibition of AKT1 was validated to reverse GC resistance. Additionally, upon silencing predicted master regulators of GC resistance with shRNA screens, 13 out of 16 were validated to significantly overcome resistance. In breast cancer, I discovered that STAT3 is required for transformation of HER2+ breast cancer, an aggressive breast tumor subtype. Suppression of STAT3 was confirmed in vitro and in vivo to be an effective therapy for HER2+ breast cancer; moreover, my analysis revealed that STAT3 silencing works only in estrogen receptor-negative (ER-) cases. Using my framework, I have also identified potential therapeutic targets for ABC- and GCB-type diffuse large B-cell lymphoma (DLBCL) and for breast cancer subtypes that are currently being validated.
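The intuition behind integrating multiple shRNAs per gene can be illustrated with a minimal normal-normal shrinkage model, a much-simplified stand-in for the hierarchical Bayesian approach the abstract describes (all numbers are hypothetical):

```python
import numpy as np

def gene_effect_posterior(shrna_scores, shrna_se, prior_mean=0.0, prior_var=1.0):
    """Conjugate normal-normal estimate of a gene-level effect from several
    shRNA-level depletion scores: each shRNA is treated as a noisy
    observation of the same latent gene effect, so precisions add and
    unreliable hairpins are automatically down-weighted.
    """
    scores = np.asarray(shrna_scores, dtype=float)
    var = np.asarray(shrna_se, dtype=float) ** 2
    prec = 1.0 / prior_var + np.sum(1.0 / var)
    mean = (prior_mean / prior_var + np.sum(scores / var)) / prec
    return mean, 1.0 / prec            # posterior mean and variance

# Hypothetical gene with 4 shRNAs; the discordant +0.5 score has a large
# standard error (likely off-target) and contributes little.
mean, var = gene_effect_posterior([-1.2, -0.9, -1.4, 0.5],
                                  [0.3, 0.4, 0.35, 1.5])
print(f"posterior effect {mean:.2f} +/- {np.sqrt(var):.2f}")
```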
315

A Novel Platform to Perform Cancer-Relevant Synthetic Dosage Lethality Screens in Saccharomyces cerevisiae

Dittmar, John January 2013 (has links)
The most significant challenge in developing cancer therapies is to selectively kill cancer cells while leaving normal cells unharmed. One approach to specifically target tumors is to exploit gene expression changes specific to cancer cells. For example, synthetic dosage lethal (SDL) interactions occur when increased gene dosage is lethal only in combination with a specific gene disruption. Since cancer cells often over-express a host of gene products, discovering SDL interactions could reveal new therapeutic targets for cancer treatment: cancer cells over-expressing a particular gene would be killed when another gene's expression was silenced. Critically, normal cells would be unaffected because the gene whose expression is knocked down is not essential in normal cells.

In this study, I describe the use of Saccharomyces cerevisiae as a model system to speed the discovery of cancer-relevant SDL interactions. This approach is based on the observation that many genes that are over-expressed in human cancers are involved in essential functions, such as regulation of the cell cycle and DNA replication. Consequently, many of these genes are highly conserved from humans to Saccharomyces cerevisiae, and therefore any SDL interaction discovered through these screens that involves a gene with a mammalian ortholog is a potential therapeutic cancer target. Using Saccharomyces cerevisiae as a model has several advantages over performing similar screens in mammalian systems, including reduced costs and faster results. For example, a novel technique called Selective Ploidy Ablation (SPA) allows the completion of a full genome-wide SDL screen in only 6 days. Using SPA to perform SDL screens produces tens of thousands of yeast colonies that must be analyzed appropriately to identify affected mutants. To aid in the data processing, I developed ScreenMill, a suite of software tools for the quantification and review of high-throughput screen data. As a companion to ScreenMill, I also developed a tool termed CLIK (Cutoff Linked to Interaction Knowledge), which uses the wealth of known yeast genetic and physical interactions, instead of statistical models, to inform screen cutoffs and evaluate screen results. Together these tools aided in the completion and evaluation of 23 cancer-relevant yeast SDL screens, from which I prioritized a list of validated SDL interactions that have human orthologs and thus represent potential targets for cancer treatment.

To understand the mechanism underlying one SDL interaction discovered in these screens, I analyzed the results of over-expressing NPL3 in more detail. The Npl3 protein plays a role in mRNA processing and translation, and its human ortholog, SFRS1 (ASF), is involved in pre-mRNA splicing, mRNA nuclear export, and translation, and is also up-regulated in PTEN-deficient breast cancer. In yeast, over 50 novel SDL interactions with NPL3 were discovered, including several with deletions of lysine deacetylases (KDACs). These are particularly interesting because KDACs are evolutionarily conserved and are currently being explored as potential anti-cancer drug targets. Furthermore, using several mutant alleles of NPL3, I show that most of the SDL interactions defined are due to its role in the nucleus and are linked to the acetylation state of the Npl3 protein. Thus, by performing SDL screens in yeast, I demonstrate their utility in defining potential cancer-relevant drug targets, as well as in uncovering novel gene functions.
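To give a feel for what scoring tens of thousands of colonies involves, here is a hypothetical, much-simplified scoring sketch in the spirit of colony quantification for an SDL screen. The function, normalization, and numbers are illustrative assumptions, not the published ScreenMill pipeline:

```python
import numpy as np

def sdl_scores(control_sizes, query_sizes, pseudocount=0.05):
    """Toy SDL scoring: plate-normalized colony sizes are compared between
    the over-expression (query) and control conditions; growth defects
    appear as negative log2 ratios. Real pipelines add spatial correction,
    replicate handling, and manual review.
    """
    c = np.asarray(control_sizes, float)
    q = np.asarray(query_sizes, float)
    c = c / np.median(c)               # per-plate normalization
    q = q / np.median(q)
    return np.log2((q + pseudocount) / (c + pseudocount))

# Hypothetical colony areas for 5 deletion mutants; mutant 4 is sick only
# when the query gene is over-expressed -- a candidate SDL interaction.
control = [420, 390, 405, 410, 398]
query   = [430, 400, 415,  60, 405]
print(sdl_scores(control, query).round(2))
```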
316

Analyzing Genomic Studies and a Screen for Genes that Suppress Information Loss During DNA Damage Repair

Pierce, Steven January 2013 (has links)
This thesis is concerned with the means by which cells preserve genetic information and, in particular, with the competition between different DNA damage responses. DNA is continuously damaged, and imperfect repair can have extremely detrimental effects. Double-strand breaks (DSBs) are the most severe form of damage and can be repaired in several different ways or countered by other cellular responses. DNA context is important: cell cycle stage, chromosomal structure, and sequence all can make DSBs more likely or more problematic to repair. Saccharomyces cerevisiae is very resilient to DSBs and primarily uses a process called homologous recombination (HR) to repair DNA damage. To further our understanding of how S. cerevisiae efficiently uses homologous recombination, and thereby minimizes genetic degradation, I performed a screen for genes affecting this process.

In devising this study, I set out to quickly quantify the contribution of every non-essential yeast gene to suppressing genetic rearrangements and deletions at a single locus. Before I began, I did not fully appreciate how variable and contingent this type of recombination phenotype could be. Accounting for the complex and changing recombination baseline across many tests became a significant effort unto itself. The requirements of the experimental protocols precluded the use of traditional recombination rate calculation methods. Searching for the means to compare the utility of normalizations and to validate my results, I sought general approaches for analyzing genome-wide screen data and coordinating interpretation with existing knowledge. It was advantageous during this study to develop novel analysis tools. The second chapter describes one of the tools we developed, a technique called CLIK (Cutoff Linked to Interaction Knowledge). CLIK uses preexisting biological information to evaluate screen performance and to empirically define a significance threshold. This technique was used to analyze the screen results described in chapter three.

The screen in chapter three represents the primary work of this dissertation. Its purpose was to identify genes and biological processes important for the suppression of recombination between DNA tandem repeats in yeast. By searching for gene deletion strains that show an increase in non-conservative single-strand annealing (SSA), I found that many genetic backgrounds could induce altered recombination frequencies, with genes involved in DNA repair, mitochondrial structure, ribosome function, and chromatin remodeling being most important for minimizing the loss of genetic information by HR. In addition, I found that ARP8 and IES5, subunits of the INO80 chromatin-remodeling complex, are significant in suppressing SSA.
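For context on the "traditional recombination rate calculation methods" the abstract says were precluded, here is a sketch of the classic Lea-Coulson method-of-the-median fluctuation-analysis estimator that such rate calculations are typically built on (the culture counts are hypothetical):

```python
import numpy as np
from scipy.optimize import brentq

def lea_coulson_rate(event_counts, cells_per_culture):
    """Classic Lea-Coulson method of the median: solve
    r_median/m - ln(m) = 1.24 for m, the expected number of events per
    culture, then divide by the final cell count to get a per-cell,
    per-division rate. Shown for context only; the thesis's protocol
    required different normalizations.
    """
    r_med = float(np.median(event_counts))
    f = lambda m: r_med / m - np.log(m) - 1.24
    m = brentq(f, 1e-8, 1e6)           # bracket guarantees a sign change
    return m / cells_per_culture

# Hypothetical recombinant colony counts from 8 parallel cultures.
counts = [3, 7, 2, 11, 5, 4, 9, 6]
print(f"rate ~ {lea_coulson_rate(counts, 2e7):.2e} events/cell/division")
```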
317

Computational Contributions Towards Scalable and Efficient Genome-wide Association Methodology

Prabhu, Snehit January 2013 (has links)
Genome-wide association studies are experiments designed to find the genetic bases of physical traits: for example, finding markers correlated with disease status by comparing the DNA of healthy individuals to the DNA of affected individuals. Over the past two decades, an exponential increase in the resolution of DNA-testing technology, coupled with a substantial drop in its cost, has allowed us to amass huge and potentially invaluable datasets for such comparative studies. For many common diseases, datasets as large as a hundred thousand individuals exist, each individual tested at millions of markers (called SNPs) across the genome. Despite this treasure trove, so far only a small fraction of the genetic markers underlying most common diseases have been identified. Simply stated, our ability to predict phenotype (disease status) from a person's genetic constitution is still very limited today, even for traits that we know to be heritable (e.g. height, diabetes, cardiac health). As a result, genetics today often lags far behind conventional indicators like family history of disease in terms of predictive power. To borrow a popular metaphor from astronomy, this veritable "dark matter" of perceivable but unlocatable genetic signal has come to be known as missing heritability. This thesis presents my research contributions to two hotly pursued scientific hypotheses that aim to close this gap: (1) gene-gene interactions and (2) ultra-rare genetic variants, neither of which is yet widely tested. First, I discuss the challenges that have made interaction testing difficult and present a novel approximate statistic to measure interaction. This statistic can be exploited in a Monte Carlo-like randomization scheme, making an exhaustive search through trillions of potential interactions tractable on ordinary desktop computers. A software implementation of our algorithm found a reproducible interaction between SNPs in two calcium channel genes in bipolar disorder. Next, I discuss the functional enrichment pipeline we subsequently developed to identify sets of interacting genes underlying this disease. Lastly, I discuss the application of coding theory to cost-efficient measurement of ultra-rare genetic variation (sometimes as rare as a single individual carrying the mutation in the entire population).
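A toy version of the general scheme can make this concrete: a cheap interaction statistic (here, a case/control correlation contrast between two SNPs) evaluated against a Monte Carlo permutation null. This is an illustrative stand-in, not the thesis's exact statistic, and the genotype data are simulated:

```python
import numpy as np

rng = np.random.default_rng(0)

def ld_contrast(snp_a, snp_b, case):
    """Toy epistasis statistic: absolute difference in SNP-SNP correlation
    between cases and controls. A genuine interaction effect inflates the
    correlation among cases relative to controls.
    """
    r_case = np.corrcoef(snp_a[case], snp_b[case])[0, 1]
    r_ctrl = np.corrcoef(snp_a[~case], snp_b[~case])[0, 1]
    return abs(r_case - r_ctrl)

# Simulated minor-allele counts (0/1/2) with an interaction: carrying
# minor alleles at BOTH loci raises disease probability.
n = 4000
a = rng.integers(0, 3, n)
b = rng.integers(0, 3, n)
p = 1 / (1 + np.exp(-(-2.0 + 0.8 * (a > 0) * (b > 0))))
case = rng.random(n) < p

obs = ld_contrast(a, b, case)
null = [ld_contrast(a, b, rng.permutation(case)) for _ in range(999)]
pval = (1 + sum(s >= obs for s in null)) / 1000
print(f"observed contrast {obs:.3f}, permutation p ~ {pval:.3f}")
```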
318

Probabilistic Reconstruction and Comparative Systems Biology of Microbial Metabolism

Plata Caviedes, German January 2013 (has links)
With the number of sequenced microbial species soon to be in the tens of thousands, we are in a unique position to investigate microbial function, ecology, and evolution on a large scale. In this dissertation I first describe the use of hundreds of in silico models of bacterial metabolic networks to study the long-term evolution of growth and gene-essentiality phenotypes. The results show that, over billions of years of evolution, the conservation of bacterial phenotypic properties drops by a similar fraction per unit time, following an exponential decay. The analysis provides a framework to generate and test hypotheses related to the phenotypic evolution of different microbial groups, and for comparative analyses based on the phenotypic properties of species. Mapping genome sequences to phenotypic predictions (such as those used in the analysis just described) critically relies on accurate functional annotations. In this context, I next describe GLOBUS, a probabilistic method for genome-wide biochemical annotation. GLOBUS uses Gibbs sampling to calculate probabilities for each possible assignment of genes to metabolic functions, based on sequence information and both local and global genomic context data. Several important functional predictions made by GLOBUS were experimentally validated in Bacillus subtilis, and hundreds more were obtained across other species. Complementary to the automated annotation method, I also describe the manual reconstruction and constraint-based analysis of the metabolic network of the malaria parasite Plasmodium falciparum. After careful reconciliation of the model with available biochemical and phenotypic data, the high-quality reconstruction allowed the prediction and in vivo validation of a novel potential antimalarial target. The model was also used to contextualize different types of genome-scale data, such as gene expression and metabolomics measurements. Finally, I present two projects related to population-genetic aspects of sequence and genome evolution. The first addresses the question of why highly expressed proteins evolve slowly, showing that, at least for Escherichia coli, this is more likely to be a consequence of selection for translational efficiency than of selection to avoid misfolded-protein toxicity. The second investigates genetic robustness mediated by gene duplicates in the context of large natural microbial populations. The analysis shows that, under these conditions, the ability of duplicated yeast genes to effectively compensate for the loss of their paralogs is not a monotonic function of their sequence divergence.
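The constraint-based analysis mentioned for the P. falciparum reconstruction is typically flux balance analysis (FBA): maximize a biomass objective subject to steady-state mass balance and flux bounds. Here is a minimal sketch on a hypothetical three-metabolite toy network, far smaller than any real reconstruction:

```python
import numpy as np
from scipy.optimize import linprog

# Toy network:
#   R1: -> A            (uptake, bounded at 10)
#   R2: A -> B
#   R3: A -> C
#   R4: B + C -> biomass
# Steady state requires S v = 0 for internal metabolites A, B, C.
S = np.array([
    [ 1, -1, -1,  0],   # metabolite A
    [ 0,  1,  0, -1],   # metabolite B
    [ 0,  0,  1, -1],   # metabolite C
])
bounds = [(0, 10), (0, None), (0, None), (0, None)]
c = np.array([0, 0, 0, -1.0])    # maximize v4 (biomass) => minimize -v4

res = linprog(c, A_eq=S, b_eq=np.zeros(3), bounds=bounds, method="highs")
# R4 needs one unit each of B and C, so v1 = 2*v4 and the uptake bound
# of 10 caps biomass flux at 5.
print("optimal biomass flux:", res.x[3])
```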
319

Network and Algebraic Topology of Influenza Evolution

Chan, Joseph January 2013 (has links)
Evolution is a force that has molded human existence since its divergence from chimpanzees about 5.4 million years ago. An influenza virus, which replicates every six hours, would undergo an equivalent number of generations in only a hundred years. The fast replication times of influenza, coupled with its high mutation rate, make the virus a perfect model for studying real-time evolution at a mega-darwin scale, more than a million times faster than human evolution. While recent developments in high-throughput sequencing provide an optimal opportunity to dissect the virus's genetic evolution, a concurrent growth in computational tools is necessary to analyze the large influx of complex genomic data. In my thesis, I present novel computational methods to examine different aspects of influenza evolution. I first focus on seasonal influenza, particularly the problems that hamper public health initiatives to combat the virus, and introduce two new approaches: (1) the q2-coefficient, a method of quantifying pathogen surveillance, and (2) FluGraph, a technique that employs network topology to track the spread of seasonal influenza around the world. The second chapter of my thesis examines how mutations and reassortment combine to alter the course of influenza evolution towards pandemic formation. I highlight inherent deficiencies in the current phylogenetic paradigm for analyzing evolution and offer a novel methodology, based on algebraic topology, that comprehensively reconstructs both vertical and horizontal evolutionary events. I apply this method to viruses, with emphasis on influenza, but foresee broader application to cancer cells, bacteria, eukaryotes, and other taxa.
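The key idea of the algebraic-topology approach is that horizontal events (such as reassortment) create loops in sequence space that no tree can represent; these loops appear as one-dimensional persistent homology classes. A minimal sketch, assuming the ripser.py package is available and using four toy aligned sequences where one looks like a reassortant of two others:

```python
import numpy as np
from ripser import ripser   # assumes the ripser.py package is installed

def hamming_matrix(seqs):
    """Pairwise Hamming distances between equal-length aligned sequences."""
    arr = np.array([list(s) for s in seqs])
    n = len(seqs)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = np.sum(arr[i] != arr[j])
    return D

# Toy alignment: D carries the first half of B and the second half of C,
# a reassortant-like pattern that forms a 4-cycle (A-B-D-C-A) rather
# than a tree, producing a persistent H1 interval.
seqs = ["AAAAAAAA",   # A: ancestor
        "TTTTAAAA",   # B: mutations in first half
        "AAAATTTT",   # C: mutations in second half
        "TTTTTTTT"]   # D: both halves combined
dgms = ripser(hamming_matrix(seqs), maxdim=1, distance_matrix=True)["dgms"]
print("H1 intervals (loops => candidate horizontal events):", dgms[1])
```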
320

Large-scale Functional Connectivity in the Human Brain Reveals Fundamental Mechanisms of Cognitive, Sensory and Emotion Processing in Health and Psychiatric Disorders

Pantazatos, Spiro January 2014 (has links)
Functional connectivity networks that integrate remote areas of the brain as working functional units are thought to underlie fundamental mechanisms of perception and cognition, and their study has emerged as an active area of investigation. However, traditional approaches to measuring functional connectivity are limited in that they rely on a priori specification of one or a few brain regions. The development of data-driven, exploratory approaches that assess functional connectivity on a large scale is therefore required to further understand the functional network organization of these processes in both health and disease. In this thesis project, I investigate the roles of functional connectivity in visual search (Chapter 2; Pantazatos, Yanagihara et al., 2012) and bistable perception (Chapter 3; Karten et al., 2013) using traditional functional connectivity approaches, and develop and apply new approaches to characterize the large-scale networks underlying the processing of supraliminal (Chapter 4; Pantazatos et al., 2012a) and subliminal (Chapter 5; Pantazatos, Talati et al., 2012b) emotional threat signals, speech and song processing in autism (Chapter 6; Lai et al., 2012), and face processing in social anxiety disorder (Chapter 7; Pantazatos et al., 2013). Finally, I complement the latter study with an investigation of structural morphological abnormalities in social anxiety disorder (Chapter 8; Talati et al., 2013). Each of these chapters has been or is about to be published in a peer-reviewed journal, and this thesis provides an overview of the entire body of investigation, based on advances in understanding the role of large-scale neural processes as fundamental organizational units that underlie behavior. In Chapter 2, Independent Components Analysis (ICA), Psychophysiological Interactions (PPI), and Dynamic Causal Modeling (DCM) analyses were used to investigate the hypothesis that expectation- and attention-related interactions between ventral and medial prefrontal cortex and association visual cortex underlie visual search for an object. The results extend previous models of visual search to include specific frontal-occipital neuronal interactions during a natural and complex search task. In Chapter 3, PPI analyses revealed percept-dependent changes in connectivity between visual cortex, frontoparietal attention, and default mode networks during bistable image perception. These findings advance neural models of bistable perception by implicating the default mode and frontoparietal networks in image segmentation. In Chapters 4 and 5, an exploratory approach based on multivariate pattern analysis of large-scale, condition-dependent functional connectivity was developed and applied in order to further understand the neural mechanisms of threat-related emotion processing. This approach extracted sufficient information to "brain-read" both unattended supraliminal (Chapter 4) and subliminal (Chapter 5) fear perception in healthy subjects. Informative features for supraliminal fear perception included functional connections between thalamus and superior temporal gyrus, angular gyrus and hippocampus, and fusiform and amygdala, while informative features for subliminal fear perception included middle temporal gyrus, cerebellum, and angular gyrus. In psychiatric disorders, large-scale functional connectivity is typically assessed during the resting state (i.e., no task or stimulus).
However, disorder-dependent alterations in functional network architecture may be more or less prominent during a stimulus or task that is behaviorally relevant to the disorder, as exemplified by enhanced long-range, frontal-posterior connectivity during song (vs. speech) perception in autism (Chapter 6). In the case of social anxiety disorder (SAD), pattern analysis of large-scale functional connectivity during neutral face perception was sensitive enough to discriminate individual subjects with SAD from both healthy controls and subjects with panic disorder (Chapter 7). The most informative feature was functional connectivity between the left hippocampus and left temporal pole, which was reduced in medication-free SAD subjects and which increased following 8 weeks of SSRI treatment, with greater increases correlating with greater decreases in symptom severity. This finding parallels observed neuroanatomical abnormalities in SAD, which include reduced grey matter volume in the temporal pole, in addition to increased grey matter volume in cerebellum and fusiform (Chapter 8). Together, these findings suggest promise for emerging functional connectivity-based and structural neurobiomarkers of SAD diagnosis and treatment response.
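The pattern-analysis approach used in Chapters 4, 5, and 7 can be sketched in a few lines: vectorize each subject's region-by-region correlation matrix and feed it to a cross-validated linear classifier. The data below are simulated with a hypothetical group difference in one connection; this is a minimal stand-in for the published analyses:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def connectivity_features(ts):
    """Upper triangle of the region-by-region correlation matrix: the
    large-scale functional-connectivity pattern used as the feature
    vector for 'brain-reading' classifiers."""
    C = np.corrcoef(ts)
    return C[np.triu_indices_from(C, k=1)]

# Simulated data: 40 subjects x 20 regions x 100 timepoints; the
# 'patient' group has weakened coupling between regions 0 and 1.
X, y = [], []
for s in range(40):
    ts = rng.standard_normal((20, 100))
    shared = rng.standard_normal(100)
    w = 0.2 if s < 20 else 1.0      # group-dependent coupling strength
    ts[0] += shared
    ts[1] += w * shared
    X.append(connectivity_features(ts))
    y.append(int(s < 20))

clf = LogisticRegression(max_iter=1000)
acc = cross_val_score(clf, np.array(X), np.array(y), cv=5)
print("cross-validated accuracy:", round(acc.mean(), 2))
```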
