Global ETD Search

1	[en] IDENTIFICATION OF PROTEIN SUBCELLULAR LOCALIZATION BY DEEP LEARNING TECHNIQUES / [pt] IDENTIFICAÇÃO DA LOCALIZAÇÃO SUBCELULAR DE PROTEÍNAS POR MEIO DE TÉCNICAS DE DEEP LEARNING ROBERTO BANDEIRA DE MELLO MORAIS DA SILVA 21 May 2020 Proteins are biological macromolecules composed of amino acid chains. They take part in practically all cellular processes and are essential for the correct functioning of the human organism. Many studies of the human proteome aim to identify the functions of each protein in the different cells, tissues and organs of the human body. Classifying these proteins in different ways, for example by subcellular localization, is important for many biomedical applications. With advances in protein imaging technology, images are now generated in large volumes and faster than they can be classified manually, which makes it important to develop an automatic classifier that can perform this task effectively. This dissertation therefore aimed to develop algorithms capable of automatically classifying mixed patterns of protein subcellular localization using Deep Learning techniques. Initially, a literature review on neural networks, Deep Learning and SVMs was carried out, and the publicly available cell image database of the Human Protein Atlas was used to train the supervised learning algorithms. Several models were developed and evaluated to identify the one with the best performance in the classification task. Throughout this work, convolutional artificial neural networks of the LeNet and ResNet topologies and a hybrid ResNet-SVM model were developed, with a total of 81 different neural networks trained in order to identify the best set of hyper-parameters. 
The analysis led to the conclusion that the best-performing network was a variant of the ResNet topology, which achieved an accuracy of 0.94 and an F1 score of 0.44 on the test set. The results obtained by the different topologies were evaluated in detail and, based on these results, future work was suggested on possible improvements to the best-performing networks. [pt] REDE NEURAL [pt] HUMAN PROTEIN ATLAS [pt] MULTI CLASSE [pt] APRENDIZADO PROFUNDO [pt] CLASSIFICACAO [en] NEURAL NETWORKS [en] HUMAN PROTEIN ATLAS [en] MULTI LABEL [en] DEEP LEARNING [en] CLASSIFICATION
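The gap between an accuracy of 0.94 and an F1 score of 0.44 is typical of multi-label problems with rare labels. The sketch below illustrates the effect on invented toy data. It assumes accuracy is counted per individual (image, label) decision and that F1 is macro-averaged over labels; the abstract does not state which definitions the dissertation used, and the function names are hypothetical.

```python
from typing import List

Labels = List[List[int]]  # one row per image, one 0/1 entry per label


def per_label_accuracy(y_true: Labels, y_pred: Labels) -> float:
    """Fraction of individual (image, label) decisions that are correct."""
    pairs = [(t, p) for rt, rp in zip(y_true, y_pred) for t, p in zip(rt, rp)]
    return sum(t == p for t, p in pairs) / len(pairs)


def macro_f1(y_true: Labels, y_pred: Labels) -> float:
    """Unweighted mean of per-label F1 (a label with no TP/FP/FN scores 0)."""
    n_labels = len(y_true[0])
    scores = []
    for j in range(n_labels):
        rows = list(zip(y_true, y_pred))
        tp = sum(rt[j] == 1 and rp[j] == 1 for rt, rp in rows)
        fp = sum(rt[j] == 0 and rp[j] == 1 for rt, rp in rows)
        fn = sum(rt[j] == 1 and rp[j] == 0 for rt, rp in rows)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_labels


# Toy data (illustrative only): label 0 is common, labels 1-3 are rare.
y_true = [[1, 0, 0, 0]] * 5 + [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]] + [[0, 0, 0, 0]] * 2
# A model that gets the common label right but never predicts a rare one.
y_pred = [[1, 0, 0, 0]] * 5 + [[0, 0, 0, 0]] * 5

acc = per_label_accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(acc, f1)
```

On this toy data 37 of 40 decisions are correct, yet three of the four labels have an F1 of zero, so the macro F1 is 0.25. Most entries in a sparse label matrix are negatives, and accuracy rewards predicting them.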
2	Methods for Generation and Characterization of Monospecific Antibodies Rockberg, Johan January 2008 Recent advances in biotechnology have generated possibilities to investigate and measure parts of life previously left for believers to explain. Utilizing the same book of recipes, the genome, our cells produce a selection of proteins at any given time and thereby niche into a multitude of specialized cell types, tissues and organs comprising our body. Knowledge of the precise protein composition in a given organ under normal and disease conditions would be invaluable, both for identifying disease causes and designing new pharmaceuticals, and for a deeper understanding of the processes of life. This doctoral thesis describes the start and progress of a visionary project (HPR) to localize all human proteins in our body, with emphasis on the generation and characterization of antibodies used as protein targeting missiles. To identify one human protein in an environment as complex as our body, precise and specific means of detection are essential. The first two papers (I-II) describe software developed for generation of monospecific antibodies satisfying such needs, using a set of rules for antigen optimization. Five years after the project started, a large number of antibodies with documented characteristics have been generated. The third paper (III) illustrates an attempt to sieve these antibody characteristics to develop a tool for further improvement of antigen selection, based on the correlation between antigen sequence and amount of specific antibody generated. Having a panel of protein-specific antibodies is of great value, not only for localization studies, but also as possible target-directed pharmaceuticals. 
In such cases, knowledge of the precise epitope recognized by the antibody on its target protein is an important aid, both for understanding its effect and for anticipating unwanted cross-reactivity. Paper (IV) describes the development of a high-resolution method for epitope mapping of antibodies using staphylococcal display. An application of the method is described in the last paper (V), where it is used to map an anti-HER2 monospecific antibody with growth-inhibiting effects on breast cancer cells. The monospecific antibody was fractionated into separate populations and five novel epitopes related to cancer cell growth inhibition were determined. Altogether, these methods are valuable tools for generation and characterization of monospecific antibodies. / QC 20100907 antibody epitope mapping HER2 human protein atlas immunogenicity proteomics Bioengineering Bioteknik
3	Pooling of rabbit antisera to reduce lot to lot variability of polyclonal antibodies Ozimek, Paulina January 2017 No description available. human protein atlas polyclonal antibodies immunohistochemistry immunocytochemistry antibody purification Engineering and Technology Teknik och teknologier
4	Spatial proteome profiling of the compartments of the human cell using an antibody-based approach Wiking, Mikaela January 2017 The human cell is complex, with countless processes ongoing in parallel in specialized compartments, the organelles. Cells can be studied in vitro by using immortalized cell lines that represent cells in vivo to a varying degree. Gene expression varies between cell types and an average cell line expresses around 10,000-12,000 genes, as measured with RNA sequencing. These genes encode the cell’s proteome: the full set of proteins that perform functions in the cell. In paper I we show that RNA sequencing is a necessary tool for studying the proteome of the human cell. By studying the proteome, and where proteins localize in the cell, information can be assembled on how the cell functions. Image-based methods allow for detailed spatial resolution of protein localization and also enable the study of temporal events. A protein can be visualized either by using a cell line that is transfected to express the protein with a fluorescent tag, or by targeting the protein with an affinity reagent such as an antibody. In paper II we present subcellular data for a majority of the human proteins, showing that there is a high degree of complexity in where proteins localize in the cell. Cellular energy is generated in the mitochondria, an important organelle that is also active in many other functions. Today only about a third of the estimated mitochondrial proteome has been validated experimentally, indicating that there is much more to understand about the functions of the mitochondria. In paper III we explore the mitochondrial proteome, based on the results of paper II. We also present a method for sublocalizing proteins to subcompartments that can be performed in a high-throughput manner. 
To conclude, this thesis shows that transcriptomics is a useful tool for proteome-wide subcellular localization, and presents high-resolution spatial distribution data for the human cell with a deeper analysis of the mitochondrial proteome. / QC 20170512 Fluorescence imaging Human proteome Human Protein Atlas Immunofluorescence Mitochondria Organelles Spatial distribution Cell Biology Cellbiologi
5	Interrogation of Nucleic Acids by Parallel Threading Pettersson, Erik January 2007 Advancements in the field of biotechnology are expanding the scientific horizon and a promising era is envisioned with personalized medicine for improved health. The amount of genetic data is growing at an ever-escalating pace due to the availability of novel technologies that allow massively parallel sequencing and whole-genome genotyping, which are supported by advancements in computer science and information technologies. As the amount of information stored in databases throughout the world grows and our knowledge deepens, genetic signatures of significant importance are discovered. The signatures that surface in the data mining process may include causative or marker single nucleotide polymorphisms (SNPs), revealing predisposition to disease, or gene expression signatures, profiling a pathological state. When targeting a reduced set of signatures in a large number of samples for diagnostic or fine-mapping purposes, efficient interrogation and scoring require appropriate preparations. These needs are met by miniaturized and parallelized platforms that allow low sample and template consumption. This doctoral thesis describes an attempt to tackle some of these challenges by the design and implementation of a novel assay denoted Trinucleotide Threading (TnT). The method permits multiplex amplification of a medium-sized set of specific loci and was adapted to genotyping, gene expression profiling and digital allelotyping. Utilizing a reduced number of nucleotides permits specific amplification of targeted loci while preventing the generation of spurious amplification products. This method was applied to genotype 96 individuals for 75 SNPs. In addition, the accuracy of genotyping from minute amounts of genomic DNA was confirmed. 
This procedure was performed using a robotic workstation running custom-made scripts and a software tool was implemented to facilitate the assay design. Furthermore, a statistical model was derived from the molecular principles of the genotyping assay and an Expectation-Maximization algorithm was chosen to automatically call the generated genotypes. The TnT approach was also adapted to profiling signature gene sets for the Swedish Human Protein Atlas Program. Here 18 protein epitope signature tags (PrESTs) were targeted in eight different cell lines employed in the program and the results demonstrated high concordance rates with real-time PCR approaches. Finally, an assay for digital estimation of allele frequencies in large cohorts was set up by combining the TnT approach with a second-generation sequencing system. Allelotyping was performed by targeting 147 polymorphic loci in a genomic pool of 462 individuals. Subsequent interrogation was carried out on a state-of-the-art massively parallelized Pyrosequencing instrument. The experiment generated more than 200,000 reads and with bioinformatic support, clonally amplified fragments and the corresponding sequence reads were converted to a precise set of allele frequencies. / QC 20100813 genotyping multiplex amplification trinucleotide threading single nucleotide polymorphism genotype calling Expectation-Maximization protein-epitope signature tag expression profiling Human Protein Atlas pooled genomic DNA Pyrosequencing 454 allelotyping association studies bioinformatics Bioengineering Bioteknik
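The last step of the allelotyping described above, turning sequence reads from a pooled sample into allele frequencies, can be sketched as simple counting. This is only an illustration under strong assumptions: each read is taken to support exactly one allele at one locus, sequencing errors are ignored, and every individual is assumed to contribute equally to the pool. The function name, locus names and reads are hypothetical and do not come from the thesis.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def allele_frequencies(reads: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, float]]:
    """Estimate per-locus allele frequencies as the share of reads per allele."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for locus, allele in reads:
        counts[locus][allele] += 1
    return {
        locus: {allele: n / sum(c.values()) for allele, n in c.items()}
        for locus, c in counts.items()
    }


# Toy reads (illustrative only): (locus, observed allele) pairs.
toy_reads = (
    [("snp1", "A")] * 3 + [("snp1", "G")]
    + [("snp2", "C")] * 2 + [("snp2", "T")] * 2
)
freqs = allele_frequencies(toy_reads)
print(freqs)
```

With real data the counts would come from aligned reads, and a more careful estimate would model read errors and unequal amplification, which this sketch deliberately leaves out.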
