Global ETD Search

21	Identification of Publications on Disordered Proteins from PubMed Sirisha, Peyyeti 07 August 2012 Indiana University-Purdue University Indianapolis (IUPUI) / The literature on disordered proteins has been on the rise. As the number of publications increases, manually identifying the relevant publications and the protein information to add to a centralized repository (called DisProt) is becoming arduous, and increasingly critical. Existing search facilities on PubMed can retrieve a large number of publications based on keywords but do not support ranking them by the probability that the protein names mentioned in a given abstract will be added to DisProt. This thesis explores a novel system that uses disorder predictors and context-based dictionary methods to quickly identify publications on disordered proteins from the PubMed database. NLProt, which is built around Support Vector Machines, is used to identify protein names, and PONDR-FIT, an Artificial Neural Network based meta-predictor, is used to identify protein disorder. The work done in this thesis is of immediate significance in identifying disordered protein names. We tested the new system on 100 abstracts from DisProt (these abstracts had been found to be relevant to disordered proteins and were added to DisProt manually by the annotators). The system had an accuracy of 87% on this test set. We then took another 100 recently added abstracts from PubMed and ran our algorithm on them; this time it had an accuracy of 68%. We suggest improvements to increase the accuracy and believe that this system can be applied to identifying disordered proteins from the literature. DisProt, Database, Software Tool Proteins -- Analysis Bioinformatics Database searching Genomics -- Data processing
22	Assessment of genome visualization tools relevant to HIV genome research: development of a genome browser prototype. Boardman, Anelda Philine January 2004 Over the past two decades of HIV research, effective vaccine candidates have been elusive. Traditionally, viral research has been characterized by a gene-by-gene approach, but in light of the availability of complete genome sequences and the tractable size of the HIV genome, a genomic approach may improve insight into the biology and epidemiology of this virus. A genomic approach to finding HIV vaccine candidates can be facilitated by the use of genome sequence visualization. Genome browsers have been used extensively by various groups to shed light on the biology and evolution of several organisms including human, mouse, rat, Drosophila and C. elegans. Application of a genome browser to HIV genomes and related annotations can yield insight into the forces that drive evolution, identify highly conserved regions as well as regions that yield a strong immune response in patients, and track mutations that appear over the course of infection. Access to graphical representations of such information is bound to support the search for effective HIV vaccine candidates. This study aimed to determine whether a tool or application exists that can be modified to serve as a platform for development of an HIV visualization application, and to assess the viability of such an implementation. Existing applications can only be assessed for their suitability as a basis for development of an HIV genome browser once a well-defined set of assessment criteria has been compiled. AIDS (Disease), Genetic aspects HIV (Viruses), Genetic aspects HIV (Viruses), Data processing AIDS (Disease), Data processing Genomics, Data processing.
24	Analysis of integration sites of transgenic sheep generated by lentiviral vectors using next-generation sequencing technology Chen, Yu-Hsiang 31 July 2014 Indiana University-Purdue University Indianapolis (IUPUI) / The development of new methods for gene transfer benefits several fields, such as gene therapy, agriculture and animal health. The newly established lentiviral vector systems dramatically increase the efficiency of gene transfer. Some studies have shown that lentiviral vector systems are more than 10-fold more efficient than traditional pronuclear injection. However, the timing of lentiviral vector integration remains unclear. Integration at different stages of embryogenesis might lead to different integration patterns between tissues. Moreover, in our previous study we found that the vector copy number in transgenic sheep varied, with some animals having one or more copies per cell while others had less than one copy per cell, suggesting mosaicism. Here I hypothesized that after injection of a lentiviral vector into a single-cell embryo, integration can occur very early in embryogenesis but can also occur after several cell divisions. In this study, we focus on investigating integration sites in tissues developing from different germ layers as well as extraembryonic tissues to determine when integration occurs. In addition, we are also interested in insertional mutagenesis caused by viral sequence integration in or near gene regions. We utilize linear amplification-mediated polymerase chain reaction (LAM-PCR) and next-generation sequencing (NGS) technology to determine possible integration sites. In this study, a series of experiments provided evidence supporting my hypothesis, suggesting that integration events also happen after several cell divisions.
For insertional mutagenesis analysis, the closest genes can be found according to integration sites, but they are likely too far away from the integration sites to be influenced. A well-annotated sheep genome database is needed for insertional mutagenesis analysis. Genetic transformation -- Methodology Genetic vectors -- Research Genetic engineering -- Methodology Gene expression -- Analysis Sheep -- Physiology Mosaicism Lentiviruses Molecular cloning Cell division Mutagenesis Gene mapping Nucleotide sequence Genomics -- Data processing Genomics -- Technique Embryology
25	Implementation of a Laboratory Information Management System To Manage Genomic Samples Witty, Derick 05 September 2013 Indiana University-Purdue University Indianapolis (IUPUI) / A Laboratory Information Management System (LIMS) is designed to manage laboratory processes and data. Its core functionality can be extended through configuration tools and add-on modules to support the implementation of complex laboratory workflows. The purpose of this project is to demonstrate how laboratory data and processes from a complex workflow can be implemented using a LIMS. Genomic samples have become an important part of the drug development process due to advances in molecular testing technology. This technology evaluates genomic material for disease markers and provides efficient, cost-effective, and accurate results for a growing number of clinical indications. The preparation of the genomic samples for evaluation requires a complex laboratory process called the precision aliquotting workflow. The precision aliquotting workflow processes genomic samples into precisely created aliquots for analysis. The workflow is defined by a set of aliquotting scheme attributes that are executed based on scheme-specific rules logic. The aliquotting scheme defines the attributes of each aliquot based on the achieved sample recovery of the genomic sample. The scheme rules logic executes the creation of the aliquots based on the scheme definitions. LabWare LIMS is a Windows®-based open-architecture system that manages laboratory data and workflow processes. A LabWare LIMS model was developed to implement the precision aliquotting workflow using a combination of core functionality and configured code.
Genomic LabWare LIMS Laboratory Workflow Implementation Configuration Workflow management systems Software configuration management Computer logic Genomics -- Data processing Computer architecture -- Data processing Data structures (Computer science) Artificial intelligence -- Analysis Medical informatics -- Research
26	Protein function prediction by integrating sequence, structure and binding affinity information Zhao, Huiying 03 February 2014 Indiana University-Purdue University Indianapolis (IUPUI) / Proteins are nano-machines that work inside every living organism. Functional disruption of one or several proteins is the cause of many diseases. However, the functions of most proteins are yet to be annotated, because inexpensive sequencing techniques have dramatically sped up the discovery of new protein sequences (265 million and counting) and experimental examination of every protein in all its possible functional categories is simply impractical. Thus, it is necessary to develop computational function-prediction tools that complement and guide experimental studies. In this study, we developed a series of predictors for highly accurate prediction of proteins with DNA-binding, RNA-binding and carbohydrate-binding capability. These predictors use a template-based technique that combines sequence and structural information with predicted binding affinity. Both sequence-based and structure-based approaches were developed. Results indicate the importance of binding affinity prediction for improving the sensitivity and precision of function prediction. Application of these methods to the human genome and structure genome targets demonstrated their usefulness in annotating proteins of unknown function and discovering moonlighting proteins with DNA, RNA, or carbohydrate binding function. In addition, we investigated disruption of protein functions by naturally occurring genetic variations due to insertions and deletions (INDELs). We found that protein structures are the most critical features in recognising disease-causing non-frameshifting INDELs. The predictors for function prediction are available at http://sparks-lab.org/spot, and the predictor for classification of non-frameshifting INDELs is available at http://sparks-lab.org/ddig.
protein function Artificial intelligence Algorithms Protein-protein interactions Genomics -- Data processing Proteins -- Analysis Expert systems (Computer science) Data mining Bioinformatics -- Research DNA-protein interactions RNA-protein interactions Carbohydrates
27	Data analysis and creation of epigenetics database Desai, Akshay A. 21 May 2014 Indiana University-Purdue University Indianapolis (IUPUI) / This thesis is aimed at creating a pipeline for analyzing DNA methylation epigenetics data and creating a data model structured well enough to store the analysis results of the pipeline. In addition to storing the results, the model is also designed to hold information that will help researchers draw meaningful epigenetic insight from the results made available. Current major epigenetics resources such as PubMeth, MethyCancer, MethDB and NCBI’s Epigenomics database fail to provide a holistic view of epigenetics. They provide datasets produced by different analysis techniques, which raises an important issue of data integration. The resources also fail to include numerous factors defining the epigenetic nature of a gene. Some of the resources are also struggling to keep the data stored in their databases up to date. This has diminished their validity and their coverage of epigenetics data. In this thesis we have tackled a major branch of epigenetics: DNA methylation. As a case study to demonstrate the effectiveness of our pipeline, we used stage-wise raw DNA methylation and expression data for lung adenocarcinoma (LUAD) from the TCGA data repository. The pipeline helped us identify progressive methylation patterns across different stages of LUAD. It also identified some key candidates with potential as drug targets. We combined the results from the methylation data analysis pipeline with data from various online resources such as the KEGG, GO, UCSC and BioGRID databases, which helped us overcome the shortcomings of existing data collections and present the resource as a complete solution for studying DNA methylation epigenetics data.
database, epigenetics, data analysis Epigenesis -- Databases Adenocarcinoma -- Genetic aspects Lungs -- Cancer -- Databases Biological systems -- Analysis Genomics -- Data processing Browsers (Computer programs) Genomics -- Mathematical models Computational biology -- Databases
