Global ETD Search

41	Studies on support vector machines and applications to video object extraction
Liu, Yi. 22 September 2006.
No description available.
Keywords: machine learning; support vector machines; video object extraction
42	A Machine Learning Approach to Predict Gene Regulatory Networks in Seed Development in Arabidopsis Using Time Series Gene Expression Data
Ni, Ying. 08 July 2016. Master of Science.
Gene regulatory networks (GRNs) provide a natural representation of the relationships between regulators and target genes. Although inferring a GRN is a challenging task, many methods, both unsupervised and supervised, have been developed in the literature. Most of these methods, however, target non-context-specific GRNs. Because regulatory relationships are reprogrammed across different tissues and biological processes, non-context-specific GRNs may not fit some specific conditions. In addition, a detailed investigation of the prediction results has remained elusive. In this study, I propose a machine learning approach to predict GRNs in developmental stage-specific networks and show how it improves our understanding of the GRN in seed development. I developed the Beacon GRN inference tool to predict a GRN in seed development in Arabidopsis based on a support vector machine (SVM) local model. Using time series gene expression levels in seed development and previously known regulatory relationships, I evaluated and predicted the GRN for this specific biological process. The prediction results show that one gene may be controlled by multiple regulators. The targets that are strongly positively correlated with their regulators are mostly expressed at the beginning of seed development. Direct targets were identified where the promoter region of a target matched the regulator's binding sequence. The predictions provide novel, testable hypotheses about the GRN in seed development in Arabidopsis, and the Beacon GRN inference tool provides a valuable model system for context-specific GRN inference.
Keywords: network inference; signal transduction pathways; gene expression; support vector machines
43	Modelagem da produtividade da cultura da cana de açúcar por meio do uso de técnicas de mineração de dados / Modeling sugarcane yield through Data Mining techniques
Hammer, Ralph Guenther. 27 July 2016.
Understanding the hierarchy of importance of the factors that influence sugarcane yield can support its modeling, contributing to better agricultural planning at the sector's production units and to improved crop yield estimates. The objectives of this study were to rank the variables that condition sugarcane yield according to their relative importance, and to develop mathematical models for predicting sugarcane yield. To this end, three Data Mining techniques were applied to databases from sugar mills in the State of São Paulo, Brazil. Meteorological and crop management variables were analyzed with Random Forest, Boosting and Support Vector Machines, and the resulting models were tested against an independent data set using the correlation coefficient (r), the Willmott index (d), the Camargo confidence index (c), the mean absolute error (MAE) and the root mean square error (RMSE). Finally, the predictive performance of these models was compared with that of an agrometeorological model applied to the same databases. Among the variables analyzed, the number of cuts was the most important factor for all three Data Mining techniques. Comparing observed yields with those estimated by the Data Mining models gave an RMSE ranging from 19.70 to 20.03 t ha⁻¹ in the most general approach, which covers all regions in the database. The predictive performance of the Data Mining models was therefore superior to that of the agrometeorological model, whose RMSE on the same database was ≈ 70% higher (≈ 34 t ha⁻¹).
Keywords: Agricultural planning; Boosting; Prediction; Random forest; Support vector machines
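The evaluation statistics named in the abstract above (r, d, c, MAE, RMSE) can be sketched as follows. This is a minimal illustration, not the thesis's code: it assumes the commonly cited definitions, with the Willmott index as d = 1 - Σ(P - O)² / Σ(|P - Ō| + |O - Ō|)² and the Camargo index as c = r × d. The function name `yield_metrics` and the sample yields are hypothetical.

```python
import math


def yield_metrics(observed, predicted):
    """Return r, Willmott d, Camargo c, MAE and RMSE for paired yield values.

    Assumes the commonly used definitions of each statistic; the thesis may
    compute them differently.
    """
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    errors = [p - o for o, p in zip(observed, predicted)]

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # Pearson correlation coefficient between observed and predicted values.
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / math.sqrt(var_o * var_p)

    # Willmott index of agreement (1 means perfect agreement).
    denom = sum(
        (abs(p - mean_o) + abs(o - mean_o)) ** 2
        for o, p in zip(observed, predicted)
    )
    d = 1.0 - sum(e * e for e in errors) / denom

    # Camargo confidence index, taken here as the product of r and d.
    return {"r": r, "d": d, "c": r * d, "MAE": mae, "RMSE": rmse}


# Hypothetical yields in t/ha, for illustration only.
metrics = yield_metrics([80.0, 90.0, 100.0], [90.0, 100.0, 110.0])
print(metrics)
```

In this toy case every prediction is off by the same 10 t/ha, so r stays at 1 while d drops below 1. That is why the abstract reports agreement and error statistics alongside the correlation.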
45	Αναγνώριση γονιδιακών εκφράσεων νεοπλασιών σε microarrays / Identification of tumor gene expression from microarrays
Τσακανίκας, Παναγιώτης. 16 May 2007.
The substantial progress made by molecular pathology in recent years is closely tied to the development of microarray technology. This technology provides a new high-throughput approach that makes it possible (i) to analyze on a large scale the isotopic abundance of messenger RNA (mRNA) as an indicator of gene expression (cDNA arrays); (ii) to detect polymorphisms or mutations within a population of genes using single nucleotide polymorphisms (Single Nucleotide Polymorphism arrays); and (iii) to examine "loss" or "gain", or changes in the copy number, of a particular gene associated with a disease (CGH arrays). Microarray technology is likely to become a cornerstone of molecular research in the coming years, because DNA microarrays are used to quantify tens of thousands of DNA or RNA sequences in a single analysis or experiment. From a series of such experiments, it is possible to determine the mechanisms that control gene activation in an organism. In addition, the use of microarrays for gene expression profiling is a rapidly developing technology that has moved from specialized to conventional biology laboratories. This thesis mainly addresses the two general stages in the analysis of the microarray images produced by these experiments, whose aim is to extract information from them: (i) image processing and extraction of information from the image; (ii) analysis of the resulting information and identification of gene expression. For the first stage, we describe the main methods currently used by commercial and educational software packages, which also give the best results to date. For the second stage, after describing the most important methods in use, we implement a method of our own and compare it with the existing ones.
Keywords: Microarrays; Support vector machines; Image processing; Tumor identification; 621.367
46	"Investigação de estratégias para a geração de máquinas de vetores de suporte multiclasses" / Investigation of strategies for the generation of multiclass support vector machines
Lorena, Ana Carolina. 16 February 2006.
Many problems involve classifying data into categories, also called classes. Given a dataset whose classes are known, Machine Learning (ML) algorithms can be used to induce a classifier able to predict the class of new data from the same domain, thus performing the desired discrimination. Among the ML techniques applied to classification problems, Support Vector Machines (SVMs) stand out for their good generalization ability. They were originally conceived for problems with only two classes, also called binary problems. However, many problems require discriminating examples into more than two categories or classes. This thesis investigates and proposes strategies for generalizing SVMs to problems with more than two classes, known as multiclass problems. The focus is on strategies that decompose the original multiclass problem into multiple binary subproblems, whose outputs are then combined to obtain the final classification. The proposed strategies investigate how to adapt the decomposition to each multiclass application considered, using information about the performance obtained in its solution or extracted from its data. The implemented algorithms were evaluated on general datasets and on real applications from the Bioinformatics domain. The results open several possibilities for future research. Among the benefits observed are simpler decompositions, which require fewer binary classifiers in the multiclass solution.
Keywords: Bioinformatics; genetic algorithms; minimum spanning trees; multiclass problems; support vector machines
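As a rough illustration of decomposing a multiclass problem into binary subproblems and combining their outputs, the sketch below uses the standard one-vs-one decomposition with majority voting. This is a textbook scheme chosen for illustration, not necessarily one of the strategies proposed in the thesis. The names `one_vs_one_pairs` and `predict_by_voting` are hypothetical, and the binary "classifiers" are simple nearest-center rules standing in for trained binary SVMs.

```python
from collections import Counter
from itertools import combinations


def one_vs_one_pairs(classes):
    """List every unordered pair of classes: k classes give k*(k-1)/2 pairs."""
    return list(combinations(classes, 2))


def predict_by_voting(x, pairwise_classifiers):
    """Combine the binary outputs by majority vote over all pairwise classifiers."""
    votes = Counter(clf(x) for clf in pairwise_classifiers.values())
    return votes.most_common(1)[0][0]


# Toy 1-D problem with hypothetical class centers (illustration only).
centers = {"a": 0.0, "b": 5.0, "c": 10.0}


def make_binary_classifier(ci, cj):
    """Stand-in for a trained binary SVM: pick the class with the nearer center."""
    def classify(x):
        return ci if abs(x - centers[ci]) <= abs(x - centers[cj]) else cj
    return classify


pairwise = {
    (ci, cj): make_binary_classifier(ci, cj)
    for ci, cj in one_vs_one_pairs(sorted(centers))
}

print(len(pairwise))                     # number of binary subproblems
print(predict_by_voting(4.0, pairwise))  # class chosen by majority vote
```

With k classes, one-vs-one needs k(k-1)/2 binary classifiers while one-vs-all needs k. The number of binary classifiers is the cost that the abstract's "simpler decompositions" aim to reduce.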
47	"Classificação de páginas na internet" / "Internet pages classification"
Martins Júnior, José. 11 April 2003.
The rapid growth of the Internet began in the 1990s with the arrival of commercial service providers, and results mainly from the good acceptance and wide dissemination of the Web. The main problem affecting the scalability and use of the Web is the organization and classification of its content. Current search engines locate Web pages by lexical comparison between sets of words and the contents of hypertexts. This mechanism is ineffective when the goal is to find content that expresses concepts or objects, such as products for sale on electronic commerce sites. The Semantic Web was announced in 2000 for this purpose, aiming to establish new standards for the formal representation of content in Web pages. With its implementation, initially expected to take ten years, it will be possible to express concepts in hypertext content that represent objects classified by an ontology, enabling the use of knowledge-based systems implemented by intelligent software agents. The DEEPSIA project was conceived as a purchaser-centered solution, in contrast to current Market Places, to the problem of finding Web pages that describe products for sale. It uses text classification methods, supported by the k-NN and C4.5 algorithms, in the decision process carried out by an agent in its architecture, the Crawler Agent. Tests of the system on Brazilian sites showed that it needed adaptation in several respects, including the decision process involved, which is the subject of the present work. The solution to the problem involved applying and evaluating the Support Vector Machines method, and is described in detail.
Keywords: agent; electronic commerce; DEEPSIA; ontology; Support Vector Machines; text classification; Web
48	Monitoramento da cobertura do solo no entorno de hidrelétricas utilizando o classificador SVM (Support Vector Machines). / Land cover monitoring in hydroelectric domain area using Support Vector Machines (SVM) classifier.
Albuquerque, Rafael Walter de. 07 December 2011.
Satellite image classification is widely used to produce land cover maps. The main objective of this study was the automatic mapping of land cover around the Lajeado hydroelectric plant, in the state of Tocantins, using the SVM classifier. The study sought to assess the extent of anthropized areas around the reservoir and the accuracy of the classification produced by the algorithm, which was compared with the accuracy obtained by the traditional Maximum Likelihood (ML, MAXVER in Portuguese) classifier. This dissertation presents suggestions for calibrating the SVM algorithm to optimize its results. The SVM classification showed high accuracy and indicated that the surroundings of the reservoir are in an environmentally favorable condition. The results obtained with SVM were similar to those obtained with ML, although ML placed the land cover classes in their spatial context with slightly lower accuracy. Despite the good state of environmental preservation found, the surroundings of the reservoir should be properly monitored, because a large number of fires caused by the local population was detected. The tools discussed in this dissertation support this monitoring activity.
Keywords: Classification; Dam; Hydroelectric; Remote sensing; Satellite images; SVM (Support Vector Machines)
49	Effets masqués en analyse prédictive / Masked effects in predictive analysis
Bascoul, Ganaël. 27 June 2013.
The objective of this thesis is to develop two methodologies that reveal effects previously masked in decision modeling. In the first part, we implement a method for the local analysis of choice criteria in the context of binary choices. In the second part, we highlight generation effects in the study of choice behavior. In both parts, our research approach combines new predictive analysis tools (Support Vector Machines, FANOVA, PLS) with traditional tools of inferential statistics, in order to enrich the usual results with additional information on the masked effects, namely the local effects in binary choice functions and the generation effects in the temporal analysis of choice behavior. The proposed methodologies, named AEL and APC-PLS respectively, are applied to real cases in order to illustrate their operation and relevance.
Keywords: Age effects; Cohort effect; Marginal effect; Support vector machines; FANOVA; PLS; 658.8; 519
50	Wavelets, predição linear e LS-SVM aplicados na análise e classificação de sinais de vozes patológicas / Wavelets, LPC and LS-SVM applied for analysis and identification of pathological voice signals
Fonseca, Everthon Silva. 24 April 2008.
This work exploits the advantages of a mathematical tool for time-frequency analysis, the discrete wavelet transform (DWT), together with linear prediction coefficients (LPC) and an artificial intelligence algorithm, Least Squares Support Vector Machines (LS-SVM), for voice signal analysis and the classification of pathological voices. Many works in the literature have shown great interest in auxiliary tools for the diagnosis of laryngeal pathologies. The DWT components provided measurement parameters for the analysis and classification of pathological voices, mainly those from patients with Reinke's edema and nodules in the vocal folds. The database of pathological voices was obtained from the Department of Otorhinolaryngology and Head and Neck Surgery of the Hospital das Clínicas of the Faculty of Medicine of Ribeirão Preto, University of São Paulo (FMRP-USP), Brazil. Using the pattern recognition algorithm LS-SVM, the results showed that combining the Daubechies DWT components with the inverse LP filter leads to a classifier with good performance, reaching more than 90% accuracy in the classification of pathological voices.
Keywords: Discrete wavelet transform; Linear prediction inverse filter; Pathological voices; Support vector machines classifier; Wavelet transform
