51 |
Determinação pré-natal não invasiva de paternidade utilizando micro-haplótipos / Noninvasive prenatal paternity determination by microhaplotypes
Wang, Jaqueline Yu Ting 24 November 2017
Paternity tests are usually performed by analyzing DNA samples from the alleged father, the mother, and the child. Performing this test before birth used to require invasive methods such as amniocentesis and chorionic villus sampling. The discovery of fetal cell-free DNA (fcfDNA) in maternal plasma and serum, and the development of techniques to analyze it, have reduced the health risks for both fetus and mother. Although paternity tests that analyze Short Tandem Repeats (STRs) from fcfDNA are possible, they are not reliable because the DNA is often degraded. Single Nucleotide Polymorphisms (SNPs) have been shown to be good candidates for human identification and can be obtained from small DNA fragments (even from degraded DNA). However, SNPs have a limited number of different alleles (between two and four). Microhaplotypes are chromosomal segments shorter than 200 bp (base pairs) that contain two or more SNPs forming at least three distinct haplotypes. Using them as genetic markers increases the number of possible alleles formed from the SNPs. Since fcfDNA fragments are approximately 145 bp long, they are long enough to contain microhaplotypes, which can be sequenced using Next Generation Sequencing (NGS) technology. The aim of this project is to determine the probability of paternity using SNPs within microhaplotypes. The microhaplotypes were chosen based on previous literature, and their haplotype frequencies were calculated for the ethnic groups in the 1000 Genomes data. Raw sequencing data from three DNA samples were analyzed: the alleged father, the mother, and the maternal plasma (a mixture of maternal and fetal cell-free DNA). We then developed scripts to obtain and analyze the genotypes of the alleged father and the mother for each chosen microhaplotype. By combining genotypic information, population frequencies, and fetal fractions (plasma), we developed a method to calculate the probability of paternity in cases of non-exclusion.
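The abstract does not describe the calculation itself, so the following is only a minimal sketch of the textbook paternity-index approach, not the thesis's method (which also uses the fetal fraction, omitted here). It assumes that the haplotype the fetus must have inherited from its father is already known at each locus. The function names, haplotype labels, frequencies and the 0.5 prior are all illustrative assumptions.

```python
from math import prod

def locus_paternity_index(father_genotype, paternal_allele, allele_freq):
    """Paternity index at one locus (hypothetical helper).

    Numerator: chance the alleged father transmits the obligate paternal
    allele (1.0 if homozygous for it, 0.5 if heterozygous, 0.0 if absent).
    Denominator: chance a random man transmits it, i.e. its population frequency.
    """
    copies = father_genotype.count(paternal_allele)
    return (copies / 2) / allele_freq

def probability_of_paternity(indices, prior=0.5):
    """Posterior probability from per-locus indices, assuming independent loci."""
    cpi = prod(indices)  # combined paternity index
    return cpi * prior / (cpi * prior + (1 - prior))

# Invented example: (alleged father's two haplotypes, obligate paternal haplotype, frequency)
loci = [
    (("ACG", "ATG"), "ACG", 0.10),
    (("GGT", "GGT"), "GGT", 0.25),
]
indices = [locus_paternity_index(g, a, f) for g, a, f in loci]
posterior = probability_of_paternity(indices)
print(indices, posterior)
```

A locus where the alleged father carries no copy of the obligate paternal haplotype gives an index of zero, which corresponds to an exclusion in this simplified model. Real casework also has to allow for mutation and genotyping error.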
|
52 |
"Aplicação da bioinformática no estudo dos genes e enzimas envolvidos na síntese da goma fastidiana produzida pela xylella fastidiosa" / "Bioinformatic applied in studies of the enzymes involved in the biosynthesis of the exopolysaccharide, fastidian gum, produced by Xylella Fastidiosa"
Muniz, João Renato Carvalho 25 April 2003
Xylella fastidiosa is a xylem-dwelling, insect-transmitted, Gram-negative gamma-proteobacterium that causes disease in citrus and many other important crops such as grapevine, periwinkle, almond, oleander and coffee. In citrus plants, X. fastidiosa causes citrus variegated chlorosis (CVC), or "amarelinho". Nine enzymes (GumB, C, D, E, F, H, J, K and M) are involved in the biosynthetic pathway of an exopolysaccharide (EPS) called fastidian gum, which may be involved in the pathogenicity of the bacterium. These enzymes catalyze sugar-addition reactions, polymerization, and export of the EPS across the bacterial membrane. We used bioinformatic tools to study these enzymes and to understand gum biosynthesis. The nine enzymes were studied with respect to their secondary structure content, hydrophobicity, transmembrane regions, and functional classification. Building structural models by sequence homology proved impossible, owing to the lack of homologous molecules with known three-dimensional structures. Fold-recognition methods, on the other hand, gave good results, and pairwise comparisons of secondary structures were carried out with the GenThreader and THREADER 3.3 programs. Three-dimensional models of the GumB, GumK, GumM, GumJ and GumC enzymes were built with MODELLER 6.0a and validated with Procheck and VERIFY 3D. To build the model of GumH (the enzyme that catalyzes the addition of GDP-mannose to a polyprenol phosphate carrier), GenThreader found fold similarities to MurG and UDP-acetylglucosamine 2-epimerase (from E. coli), GtfB (from Amycolatopsis orientalis) and beta-GT (from T4 phage). The models are very similar, each consisting of two alpha/beta open-sheet domains, both similar in topology to the Rossmann nucleotide-binding fold and separated by a deep cleft that probably forms the GDP-mannose binding site. The interaction between the enzyme and the docked substrate was studied using the FLO program. A sequence alignment of GumH with eleven other glycosyltransferases showed several conserved regions, including the EX7E motif present in the substrate binding site. The interactions between the enzyme and its GDP-mannose substrate, and the reaction mechanism, were examined. These analyses support the three-dimensional model built for GumH, which represents the first structural information for the enzymes involved in fastidian gum synthesis.
|
53 |
Nouvelles approches bioinformatiques pour l'étude à grande échelle de l'évolution des activités enzymatiques / New bioinformatic approaches for the large-scale study of the evolution of the enzymatic activities
Pereira, Cécile 11 May 2015
This thesis proposes new methods for studying the evolution of metabolism. To that end, we address the problem of comparing the metabolism of hundreds of microorganisms. Comparing the metabolism of different species first requires knowing the metabolism of each species. The proteomes of the microorganisms we work with come from different databases and were sequenced and annotated by different teams using different methods, so the quality of their functional annotation may be heterogeneous. A standardized functional re-annotation of these proteomes is therefore necessary. Protein sequences can be annotated by transferring annotations between orthologous sequences. More than 39 databases list orthologs predicted by various methods, and these methods are known to yield partially different predictions. To take current predictions into account while adding relevant information, we developed the meta-approach MARIO. It combines the intersections of the results of several ortholog-group detection methods and enriches these groups with additional sequences by using HMM profiles. We show that our meta-approach predicts a larger number of orthologs while improving the functional similarity of the predicted ortholog pairs. This allowed us to predict the enzyme repertoire of 178 microorganism proteomes (174 of them fungal). In a second step, we analyze these enzyme repertoires to learn more about the evolution of metabolism. To this end, we look for combinations of presence/absence of enzymatic activities that characterize a given taxonomic group. It then becomes possible to infer whether the emergence of a particular taxonomic group can be explained by (or led to) the appearance of specific features in its metabolism. For that purpose, we applied interpretable supervised learning methods (rules and decision trees) to the enzymatic profiles, using the enzymatic activities as attributes, the taxonomic groups as classes, and the fungi as examples. The results, consistent with current knowledge of these organisms, show that supervised learning is effective at extracting information from phylogenetic profiles. Metabolism thus retains traces of the evolution of species. Furthermore, when the predicted classifiers make few errors, this approach can highlight probable horizontal transfers. One example is the transfer of the gene coding for EC:3.1.6.6 from an ancestor of the Pezizomycotina to an ancestor of Ustilago maydis.
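The intersection step mentioned in the abstract can be illustrated with a short sketch. This is not MARIO itself: it assumes that each method outputs a list of pairwise ortholog predictions, it keeps only the pairs that every method agrees on, and it leaves out the HMM-profile enrichment entirely. The method names, sequence identifiers and function names are hypothetical.

```python
def normalise(pairs):
    """Treat (a, b) and (b, a) as the same orthologous pair."""
    return {frozenset(pair) for pair in pairs}

def consensus_pairs(predictions):
    """Return the ortholog pairs predicted by every method.

    `predictions` maps a method name to an iterable of (seq_id, seq_id) pairs.
    """
    sets = [normalise(pairs) for pairs in predictions.values()]
    return set.intersection(*sets) if sets else set()

# Invented predictions from two hypothetical methods
predictions = {
    "method_a": [("sp1|P1", "sp2|Q1"), ("sp1|P2", "sp2|Q2")],
    "method_b": [("sp2|Q1", "sp1|P1"), ("sp1|P3", "sp2|Q3")],
}
consensus = consensus_pairs(predictions)
print(consensus)
```

Strict intersection favours precision over recall, which is presumably why the approach described above adds an enrichment step afterwards to recover sequences lost at this stage.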
|
54 |
Cloning and characterization of the human coronavirus NL63 nucleocapsid protein
Berry, Michael January 2011
The human coronavirus NL63 was discovered in 2004 by a team of researchers in Amsterdam. Since its discovery it has been shown to have spread worldwide, and it mainly affects children aged 0-5 years, the immunocompromised and the elderly. Infection with HCoV-NL63 commonly results in mild upper respiratory tract infections and presents as the common cold, with symptoms including fever, cough, sore throat and rhinorrhoea. Lower respiratory tract findings are less common but may develop into more serious complications including bronchiolitis, pneumonia and croup. The primary function of the HCoV-NL63 nucleocapsid (N) protein is the formation of the protective ribonucleocapsid core. For this particle to assemble, the N-protein undergoes N-N dimerization and then interacts with viral RNA. Besides its primary structural role, the N-protein is also understood to be involved in viral RNA transcription, translation and replication, as well as several other physiological functions. The N-protein is also highly antigenic and elicits a strong immune response in infected patients. For this reason the N-protein may serve as a target for the development of diagnostic assays. We used bioinformatic analysis to analyze the HCoV-NL63 N-protein and compare it to coronavirus N-homologues. This analysis provided the data needed to generate recombinant clones for expression in a bacterial system. We constructed recombinant clones of the N-protein of SARS-CoV and HCoV-NL63 and synthesized truncated clones corresponding to the N- and C-terminal of the HCoV-NL63 N-protein. These heterologously expressed proteins will serve as the basis for several post-expression studies, including characterizing the immunogenic epitope of the N-protein as well as identifying any antibody cross-reactivity between coronavirus species.
|
56 |
Predição da função das proteínas sem alinhamentos usando máquinas de vetor de suporte. / Protein function prediction without alignments by using support vector machines.
Dias, Ulisses Martins 26 March 2007
This thesis presents a new model for protein function prediction using support vector machines, a machine learning approach, trained on structural parameters calculated from the protein's tertiary structure. The model differs from the usual prediction paradigm in that it does not need to search, by alignment, for similarities between the query protein and the proteins of known function in public databases. The model can therefore assign function to proteins that have no similarity to known proteins, and it can be used when all other methods fail or when the user does not want to rely on similarity in function prediction. The model was validated by analyzing its performance on unknown proteins, i.e. proteins not used in the training set, using a set of binding proteins as a case study. / Fundação de Amparo a Pesquisa do Estado de Alagoas
|
57 |
Aplicação de técnicas de aprendizado de máquina no reconhecimento de classes estruturais de proteínas / Application of machine learning techniques to the recognition of protein structural classes
Bittencourt, Valnaide Gomes 25 November 2005
Classifying proteins into structural classes, which concerns the inference of patterns in their 3D conformation, is currently one of the most important open problems in Molecular Biology. The main reason is that the function of a protein is intrinsically related to its spatial conformation, yet such conformations are very difficult to obtain experimentally in the laboratory. The problem has therefore drawn the attention of many researchers in Bioinformatics. Given the large gap between the number of known protein sequences and the number of experimentally determined three-dimensional structures, demand for automated techniques for the structural classification of proteins is very high. In this context, computational tools, especially Machine Learning (ML) techniques, have become essential for dealing with this problem. In this work, the following ML techniques are used to recognize protein structural classes: Decision Trees, k-Nearest Neighbor, Naive Bayes, Support Vector Machines and Neural Networks. These methods were chosen because they represent different learning paradigms and have been widely used in the Bioinformatics literature. To improve on the performance of these techniques (the individual classifiers), homogeneous (Bagging and Boosting) and heterogeneous (Voting, Stacking and StackingC) multi-classification systems are used. Moreover, since the protein database used in this work suffers from imbalanced classes, artificial class-balancing techniques (Random Undersampling, Tomek Links, CNN, NCL and OSS) are used to mitigate this problem. To evaluate the ML methods, a cross-validation procedure is applied in which the accuracy of the classifiers is measured by the mean classification error rate on independent test sets. These means are compared pairwise with a hypothesis test to assess whether the difference between them is statistically significant. Among the individual classifiers, the Support Vector Machine achieved the best accuracy. The multi-classification systems (homogeneous and heterogeneous) showed, in general, performance superior or similar to that of the individual classifiers, especially Boosting with Decision Trees and StackingC with Linear Regression as the meta-classifier. The Voting method, despite its simplicity, also proved adequate for the problem addressed in this work. The class-balancing techniques, on the other hand, did not produce a significant improvement in the global classification error. Nevertheless, they did improve the classification error for the minority class, which is difficult to learn. In this respect, the NCL technique proved the most appropriate. / Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
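Of the class-balancing techniques listed, Tomek Links is simple enough to sketch in a few lines. The following is an illustrative implementation of the standard definition (two samples of different classes that are each other's nearest neighbour), used here for undersampling by dropping the majority-class member of each link. It is not the dissertation's code; the data, labels and function names are invented, and Euclidean distance is an assumption.

```python
from math import dist

def nearest_neighbour(i, points):
    """Index of the point closest to points[i] (Euclidean), excluding i itself."""
    return min((j for j in range(len(points)) if j != i),
               key=lambda j: dist(points[i], points[j]))

def tomek_links(points, labels):
    """Pairs (i, j), i < j, of mutual nearest neighbours with different labels."""
    links = []
    for i in range(len(points)):
        j = nearest_neighbour(i, points)
        if i < j and labels[i] != labels[j] and nearest_neighbour(j, points) == i:
            links.append((i, j))
    return links

def undersample(points, labels, majority):
    """Drop the majority-class member of every Tomek link."""
    drop = {k for link in tomek_links(points, labels)
            for k in link if labels[k] == majority}
    kept = [k for k in range(len(points)) if k not in drop]
    return [points[k] for k in kept], [labels[k] for k in kept]

# Invented 2-D feature vectors with an imbalanced labelling
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0), (5.0, 5.0)]
labels = ["maj", "min", "maj", "maj", "min"]
links = tomek_links(points, labels)
new_points, new_labels = undersample(points, labels, majority="maj")
print(links, new_labels)
```

Because only borderline majority samples are removed, the overall class ratio changes little. That is consistent with the abstract's observation that balancing mainly helped the minority class rather than the global error.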
|
58 |
"Aplicação da bioinformática no estudo dos genes e enzimas envolvidos na síntese da goma fastidiana produzida pela xylella fastidiosa" / "Bioinformatic applied in studies of the enzymes involved in the biosynthesis of the exopolysaccharide, fastidian gum, produced by Xylella Fastidiosa"João Renato Carvalho Muniz 25 April 2003 (has links)
Xylella fastidiosa é uma bactéria Gram-negativa, limitada ao xilema das plantas e o agente causador de diversas doenças em importantes plantações como citros, videiras, mirta, amêndoa, arbustos e café. Em citros, X. fastidiosa causa a Clorose Variegada dos Citros (CVC) ou amarelinho. Nove enzimas (GumB, C, D, E, F, H, J, K e M) estão envolvidas nas etapas biossintéticas de um polissacarídeo extracelular (EPS), chamado de goma fastidiana, um dos mecanismos envolvidos na patogênese da bactéria. Essas enzimas catalisam reações de adição de açúcares, polimerização e exportação do EPS através da membrana da bactéria. No presente trabalho, ferramentas de bioinformática foram utilizadas para o estudo e entendimento da biossíntese da goma fastidiana. As nove enzimas foram estudadas quanto ao seu conteúdo de estrutura secundária, análise de hidrofobicidade e das regiões transmembrânicas, classificação quanto as suas funções. A construção de modelos estruturais para as enzimas Gums através de comparação por homologia seqüencial mostrou ser um processo impossível, devido a falta de moléculas homólogas com estruturas tridimensionais conhecidas. Por outro lado, métodos de reconhecimento de enovelamento mostraram bons resultados e comparações entre as estruturas secundárias das enzimas Gums foram calculadas com a utilização dos programas GenThreader e THREADER 3.3. Modelos tridimensionais para as enzimas GumB, GumK, GumM, GumJ e GumC foram construídos com o programa MODELLER 6.0a e validados com o programa Procheck e VERIFY 3D. Para construção do modelo da GumH (enzima que catalisa a adição da GDP-manose em um lipídio carreador poliprenol), o GenThreader encontrou similaridades quando comparada a MurG (E.coli), 2-epimerase (E. coli), GtfB (Amycolatopsis orientalis) e beta-GT de fago T4. 
Todos os modelos são bastante semelhantes e compostos por dois domínios (alfa/beta), ambos similares ao motivo de ligação de nucleotídeos Rossmann fold e separados por uma fenda profunda, que, provavelmente, forma o sítio de ligação da GDP-manose. Estudos da interação entre proteína e substrato foram obtidos com a utilização do programa FLO. O alinhamento seqüencial da GumH com outras onze glicosiltransferases mostrou regiões bastante conservadas, incluindo o motivo EX7E presente no sítio de ligação do substrato na proteína. Considerações a respeito das interações do substrato GDP-manose com a enzima GumH e do mecanismo da reação foram feitas. Essas análises enfatizam o modelo obtido para a GumH, que representa a primeira estrutura proposta para as enzimas envolvidas na síntese da goma fastidiana. / Xylella fastidiosa is a xylem-dwelling, insect-transmitted gamma-protobacterium that causes pathogenicity in citrus plants and many others important crops such as grapevine, periwinkle, almond, oleander and coffee. In citrus plants, X. fastidiosa causes citrus variegated chlorosis (CVC) or amarelinho. Nine enzymes (GumB, C, D, E, F, H, J, K and M) are involved in the biosynthetic pathway of an exopolysaccharide (EPS) called fastidian gum which could be involved in the pathogenicity of the bacterium. These enzymes catalyses sugars addition reactions, polymerization and discharge of the EPS through the bacterias membrane. We have used bioinformatic tools to study these enzymes and to understand the gum biosynthesis. The nine enzymes were studied regarding to its secondary structure content, analysis of hidrophobicity and transmembrane regions, and yet function classification. The construction of structural models using sequential homology was shown to be impossible, due to the necessity of homologues molecules whose three-dimensional structures are known. 
On the other hand, fold-recognition methods gave good results, and comparisons of the secondary structures of the Gum enzymes were computed with the GenThreader and THREADER 3.3 programs. Three-dimensional models of the GumB, GumK, GumM, GumJ and GumC enzymes were built using MODELLER 6.0a and validated with the Procheck and VERIFY 3D programs. To construct the model of GumH (the enzyme that catalyses the addition of GDP-mannose to a polyprenol phosphate carrier), GenThreader found folding similarities to MurG and UDP-acetylglucosamine 2-epimerase (from E. coli), GtfB (from Amycolatopsis orientalis) and beta-GT (from T4 phage). The models are very similar, each consisting of two alpha/beta open-sheet domains, both similar in topology to the Rossmann nucleotide-binding fold and separated by a deep cleft that probably forms the GDP-mannose binding site. The interaction between the enzyme and the docked substrate was studied using the FLO program. The sequence alignment of GumH with eleven other glycosyltransferases showed several conserved regions, including the EX7E motif present in the substrate-binding site. The interactions between the enzyme and its GDP-mannose substrate, as well as the reaction mechanism, were examined. These analyses support the three-dimensional model constructed for GumH, which represents the first structural information for the enzymes involved in fastidian gum synthesis.
|
59 |
Expression studies of human coronavirus nl63- nucleocapsid, membrane and envelope proteins
Manasse, Taryn-lee January 2013
Magister Scientiae - MSc / Acute respiratory infections (ARIs) continue to be the leading cause of acute illness worldwide and remain the most important cause of mortality in infants and young children. Many viruses, such as rhinoviruses, influenza viruses, parainfluenza viruses, respiratory syncytial viruses, adenoviruses and coronaviruses, are deemed to be the etiological agents responsible for ARIs in children. The recently discovered coronaviruses HCoV-HKU1 and HCoV-NL63 contribute significantly to the hospitalization of children with ARIs. HCoV-NL63 was first identified in 2004 as the pathogen responsible for the hospitalization of a 7-month-old child presenting with coryza, conjunctivitis and fever. Since then, considerable knowledge has been gained about the clinical spectrum of this virus; however, HCoV-NL63 is still not well characterized at the molecular and proteomic level. This dissertation focuses on bringing about this characterization by cloning the HCoV-NL63 Nucleocapsid gene for expression in a bacterial system and transfecting the Nucleocapsid, Membrane and Envelope genes into a mammalian cell culture system so that their respective proteins can be expressed. With the use of bioinformatic analysis tools, certain characteristics of the HCoV-NL63 Nucleocapsid, Membrane and Envelope proteins can be identified, as well as certain motifs and/or regions that are important to the functioning of these proteins. Comparing the results obtained for HCoV-NL63 N, M and E with those of other well-studied coronavirus homologues will shed light on the potential role(s) of these proteins in determining HCoV-NL63 pathogenicity and infectivity. Although certain functions of these proteins can be deduced by means of bioinformatic analysis, it remains imperative that they be extensively characterized in vitro. This will therefore form a fundamental step in the development of many other projects, which unfortunately fall outside the scope of this M.Sc thesis.
|
60 |
Investigating cancer aetiology through the analysis of somatic mutation signatures / Analyse des empreintes mutationnelles pour la recherche sur l'étiologie des cancers humains
Ardin, Maude 30 November 2016
Les cellules cancéreuses sont caractérisées par des altérations de l'ADN causées par des facteurs exogènes, comme l'exposition à des agents environnementaux tels que le tabac ou les UV, ou par des mécanismes endogènes tels que les erreurs de polymérase lors de la réplication de l'ADN. L'analyse des causes et des conséquences de ces altérations permet de mieux comprendre les facteurs et mécanismes à l'origine du développement d'un cancer. Les technologies de séquençages à haut débit offrent l'opportunité d'étudier la nature précise de ces altérations à l'échelle du génome et permettent de révéler des signatures mutationnelles distinctes et spécifiques de cancérigènes, fournissant ainsi des hypothèses sur l'étiologie des cancers. L'objectif de ma thèse a consisté à développer des méthodes et des outils bioinformatiques accessibles et conviviaux permettant de faciliter l'analyse et l'interprétation des signatures mutationnelles à partir de données de séquençage à haut débit. L'application de ces outils et méthodes à des séries originales de tumeurs humaines et de systèmes expérimentaux de mutagénèse et carcinogénèse a permis de mieux caractériser la signature mutationnelle de l'acide aristolochique (AA) ainsi que d'autres cancérigènes d'intérêt / Cellular genomes accumulate alterations following exposures to exogenous factors, like environmental agents such as tobacco smoking or UV, or to endogenous mechanisms such as DNA replication errors. Analysing the causes and consequences of these changes allows a better understanding of the mechanisms underlying cancer development and progression. 
Next-generation sequencing (NGS) technologies provide the opportunity to study the nature of the resulting alterations on a genome-wide scale and have started to reveal distinct mutational signatures specific to past carcinogenic exposures, providing clues on cancer aetiology. The aim of my thesis was to develop user-friendly bioinformatic tools and methods for facilitating the analysis and interpretation of carcinogen-specific mutational signatures from NGS data. Applying these tools and methods to human tumours and experimental models of mutagenesis led to a better characterisation of the mutational signature of aristolochic acid (AA), as well as other carcinogens of interest.
|