Made available in DSpace on 2014-12-17T14:56:02Z (GMT). No. of bitstreams: 1
MeikaIM.pdf: 766418 bytes, checksum: 87a604688aa5cd2c4f6aba8237c67210 (MD5)
Previous issue date: 2005-12-13 / One of the most important goals of bioinformatics is the ability to identify genes in uncharacterized DNA sequences held in worldwide databases. Gene expression in prokaryotes begins when the RNA polymerase enzyme interacts with DNA regions called promoters, where the main regulatory elements of the transcription process are located. Despite improvements in in vitro techniques for molecular biology analysis, characterizing and identifying a large number of promoters in a genome remains a complex task. Computational methods, in turn, are hampered by the absence of a large set of known promoters from which to identify patterns conserved among species. Hence, an in silico method to predict promoters in any species is still a challenge. Improved promoter prediction methods can be one step towards more reliable ab initio gene prediction methods. In this work, we present an empirical comparison of Machine Learning (ML) techniques, namely Naïve Bayes, Decision Trees, Support Vector Machines (SVM), Neural Networks (Voted Perceptron), PART and k-NN, as well as ensemble approaches (Bagging and Boosting), applied to the task of predicting promoters in Bacillus subtilis. To do so, we first built two data sets of promoter and non-promoter sequences: one for B. subtilis and a hybrid one. The ML methods were evaluated with a cross-validation procedure. Good results were obtained with ML methods such as SVM and Naïve Bayes on the B. subtilis data set. However, we did not obtain good results on the hybrid data set. /
One of the great challenges of bioinformatics is to manipulate and analyze the data accumulated in worldwide databases. Gene expression in prokaryotes begins when the RNA polymerase enzyme binds to a region near the gene, called the promoter, where the main regulatory elements of the transcription process are located. Despite growing advances in experimental (in vitro) techniques in molecular biology, characterizing and identifying a significant number of promoters is still a difficult task. Existing computational methods face the lack of an adequate number of known promoters from which to identify patterns conserved among species. Hence, a method to predict them in any prokaryotic organism remains a challenge. In this work, we present an empirical comparison of individual machine learning techniques, such as the Naïve Bayes classifier, Decision Trees, Support Vector Machines, Voted Perceptron neural networks, PART and k-Nearest Neighbors, together with multi-classifier systems (Bagging and AdaBoosting) and a Hidden Markov Model, on the task of predicting prokaryotic promoters in Bacillus subtilis. We used cross-validation to evaluate all ML methods. For these tests, data sets were built with promoter and non-promoter sequences from Bacillus subtilis, along with a hybrid data set. Among the ML methods, good results were obtained with SVM and Naïve Bayes. However, we were unable to obtain relevant results for the hybrid data set.
Identifier | oai:union.ndltd.org:IBICT/oai:repositorio.ufrn.br:123456789/15416 |
Date | 13 December 2005 |
Creators | Monteiro, Meika Iwata |
Contributors | CPF:32541457120, http://lattes.cnpq.br/1562357566810393, Oliveira, Jauvane Cavalcante de, CPF:46168834330, http://lattes.cnpq.br/4054756781423727, Dória Neto, Adrião Duarte, CPF:10749896434, http://lattes.cnpq.br/1987295209521433, Souto, Marcílio Carlos Pereira de, Gonçalves, Luiz Marcos Garcia |
Publisher | Universidade Federal do Rio Grande do Norte, Programa de Pós-Graduação em Engenharia Elétrica, UFRN, BR, Automação e Sistemas; Engenharia de Computação; Telecomunicações |
Source Sets | IBICT Brazilian ETDs |
Language | Portuguese |
Detected Language | English |
Type | info:eu-repo/semantics/publishedVersion, info:eu-repo/semantics/masterThesis |
Format | application/pdf |
Source | reponame:Repositório Institucional da UFRN, instname:Universidade Federal do Rio Grande do Norte, instacron:UFRN |
Rights | info:eu-repo/semantics/openAccess |