Spelling suggestions: "subject:"supportvector machines"" "subject:"supportvector achines""
51 |
End-to-end single-rate multicast congestion detection using support vector machinesLiu, Xiaoming January 2008 (has links)
>Magister Scientiae - MSc / IP multicast is an efficient mechanism for simultaneously transmitting bulk data to multiple receivers. Many applications can benefit from multicast, such as audio and videoconferencing, multi-player games, multimedia broadcasting, distance education, and data replication. For either technical or policy reasons, IP multicast still has not yet been deployed in today’s Internet. Congestion is one of the most important issues impeding the development and deployment of IP multicast and multicast applications.
|
52 |
Evolutionary Optimization Of Support Vector MachinesGruber, Fred 01 January 2004 (has links)
Support vector machines are a relatively new approach for creating classifiers that have become increasingly popular in the machine learning community. They present several advantages over other methods like neural networks in areas like training speed, convergence, complexity control of the classifier, as well as a stronger mathematical background based on optimization and statistical learning theory. This thesis deals with the problem of model selection with support vector machines, that is, the problem of finding the optimal parameters that will improve the performance of the algorithm. It is shown that genetic algorithms provide an effective way to find the optimal parameters for support vector machines. The proposed algorithm is compared with a backpropagation Neural Network in a dataset that represents individual models for electronic commerce.
|
53 |
Studies on support vector machines and applications to video object extractionLiu, Yi 22 September 2006 (has links)
No description available.
|
54 |
A Machine Learning Approach to Predict Gene Regulatory Networks in Seed Development in Arabidopsis Using Time Series Gene Expression DataNi, Ying 08 July 2016 (has links)
Gene regulatory networks (GRNs) provide a natural representation of relationships between regulators and target genes. Though inferring GRN is a challenging task, many methods, including unsupervised and supervised approaches, have been developed in the literature. However, most of these methods target non-context-specific GRNs. Because the regulatory relationships consistently reprogram under different tissues or biological processes, non-context-specific GRNs may not fit some specific conditions. In addition, a detailed investigation of the prediction results has remained elusive. In this study, I propose to use a machine learning approach to predict GRNs that occur in developmental stage-specific networks and to show how it improves our understanding of the GRN in seed development.
I developed a Beacon GRN inference tool to predict a GRN in seed development in Arabidopsis based on a support vector machine (SVM) local model. Using the time series gene expression levels in seed development and prior known regulatory relationships, I evaluated and predicted the GRN at this specific biological process. The prediction results show that one gene may be controlled by multiple regulators. The targets that are strongly positively correlated with their regulators are mostly expressed at the beginning of seed development. The direct targets were detected when I found a match between the promoter regions of the targets and the regulator's binding sequence. Our prediction provides a novel testable hypotheses of a GRN in seed development in Arabidopsis, and the Beacon GRN inference tool provides a valuable model system for context-specific GRN inference. / Master of Science
|
55 |
Algorithm to enable intelligent rail break detectionBhaduri, Sreyoshi 04 February 2014 (has links)
Wavelet intensity based algorithm developed previously at VirginiaTech has been furthered and paired with an SVM based classifier. The wavelet intensity algorithm acts as a feature extraction algorithm. The wavelet transform is an effective tool as it allows one to narrow down upon the transient, high frequency events and is able to tell their exact location in time. According to prior work done in the field of signal processing, the local regularities of a signal can be estimated using a Lipchitz exponent at each time step of the signal. The local Lipchitz exponent can then be used to generate the wavelet intensity factor values.
For each vertical acceleration value, corresponding to a specific location on the track, we now have a corresponding intensity factor. The intensity factor corresponds to break-no break information and can now be used as a feature to classify the vertical acceleration as a fault or no fault. Support Vector Machines (SVM) is used for this binary classification task. SVM is chosen as it is a well-studied topic with efficient implementations available. SVM instead of hard threshold of the data is expected to do a better job of classification without increasing the complexity of the system appreciably. / Master of Science
|
56 |
Modelagem da produtividade da cultura da cana de açúcar por meio do uso de técnicas de mineração de dados / Modeling sugarcane yield through Data Mining techniquesHammer, Ralph Guenther 27 July 2016 (has links)
O entendimento da hierarquia de importância dos fatores que influenciam a produtividade da cana de açúcar pode auxiliar na sua modelagem, contribuindo assim para a otimização do planejamento agrícola das unidades produtoras do setor, bem como no aprimoramento das estimativas de safra. Os objetivos do presente estudo foram a ordenação das variáveis que condicionam a produtividade da cana de açúcar, de acordo com a sua importância, bem como o desenvolvimento de modelos matemáticos de produtividade da cana de açúcar. Para tanto, foram utilizadas três técnicas de mineração de dados nas análises de bancos de dados de usinas de cana de açúcar no estado de São Paulo. Variáveis meteorológicas e de manejo agrícola foram submetidas às análises por meio das técnicas Random Forest, Boosting e Support Vector Machines, e os modelos resultantes foram testados por meio da comparação com dados independentes, utilizando-se o coeficiente de correlação (r), índice de Willmott (d), índice de confiança de Camargo (C), erro absoluto médio (EAM) e raíz quadrada do erro médio (RMSE). Por fim, comparou-se o desempenho dos modelos gerados com as técnicas de mineração de dados com um modelo agrometeorológico, aplicado para os mesmos bancos de dados. Constatou-se que, das variáveis analisadas, o número de cortes foi o fator mais importante em todas as técnicas de mineração de dados. A comparação entre as produtividades estimadas pelos modelos de mineração de dados e as produtividades observadas resultaram em RMSE variando de 19,70 a 20,03 t ha-1 na abordagem mais geral, que engloba todas as regiões do banco de dados. Com isso, o desempenho preditivo foi superior ao modelo agrometeorológico, aplicado no mesmo banco de dados, que obteve RMSE ≈ 70% maior (≈ 34 t ha-1). / The understanding of the hierarchy of the importance of the factors which influence sugarcane yield can subsidize its modeling, thus contributing to the optimization of agricultural planning and crop yield estimates. The objectives of this study were to ordinate the variables which condition the sugarcane yield, according to their relative importance, as well as the development of mathematical models for predicting sugarcane yield. For this, three Data Mining techniques were applied in the analyses of data bases of several sugar mills in the State of São Paulo, Brazil. Meteorological and crop management variables were analyzed through the Data Mining techniques Random Forest, Boosting and Support Vector Machines, and the resulting models were tested through the comparison with an independent data set, using the coefficient of correlation (r), Willmott index (d), confidence index of Camargo (c), mean absolute error (MAE), and root mean square error (RMSE). Finally, the predictive performances of these models were compared with the performance of an agrometeorological model, applied in the same data set. The results allowed to conclude that, within all the variables, the number of cuts was the most important factor considered by all Data Mining models. The comparison between the observed yields and those estimated by the Data Mining techniques resulted in a RMSE ranging between 19,70 to 20,03 t ha-1, in the general method, which considered all regions of the data base. Thus, the predictive performances of the Data Mining algorithms were superior to that of the agrometeorological model, which presented RMSE ≈ 70% higher (≈ 34 t ha-1).
|
57 |
Modelagem da produtividade da cultura da cana de açúcar por meio do uso de técnicas de mineração de dados / Modeling sugarcane yield through Data Mining techniquesRalph Guenther Hammer 27 July 2016 (has links)
O entendimento da hierarquia de importância dos fatores que influenciam a produtividade da cana de açúcar pode auxiliar na sua modelagem, contribuindo assim para a otimização do planejamento agrícola das unidades produtoras do setor, bem como no aprimoramento das estimativas de safra. Os objetivos do presente estudo foram a ordenação das variáveis que condicionam a produtividade da cana de açúcar, de acordo com a sua importância, bem como o desenvolvimento de modelos matemáticos de produtividade da cana de açúcar. Para tanto, foram utilizadas três técnicas de mineração de dados nas análises de bancos de dados de usinas de cana de açúcar no estado de São Paulo. Variáveis meteorológicas e de manejo agrícola foram submetidas às análises por meio das técnicas Random Forest, Boosting e Support Vector Machines, e os modelos resultantes foram testados por meio da comparação com dados independentes, utilizando-se o coeficiente de correlação (r), índice de Willmott (d), índice de confiança de Camargo (C), erro absoluto médio (EAM) e raíz quadrada do erro médio (RMSE). Por fim, comparou-se o desempenho dos modelos gerados com as técnicas de mineração de dados com um modelo agrometeorológico, aplicado para os mesmos bancos de dados. Constatou-se que, das variáveis analisadas, o número de cortes foi o fator mais importante em todas as técnicas de mineração de dados. A comparação entre as produtividades estimadas pelos modelos de mineração de dados e as produtividades observadas resultaram em RMSE variando de 19,70 a 20,03 t ha-1 na abordagem mais geral, que engloba todas as regiões do banco de dados. Com isso, o desempenho preditivo foi superior ao modelo agrometeorológico, aplicado no mesmo banco de dados, que obteve RMSE ≈ 70% maior (≈ 34 t ha-1). / The understanding of the hierarchy of the importance of the factors which influence sugarcane yield can subsidize its modeling, thus contributing to the optimization of agricultural planning and crop yield estimates. The objectives of this study were to ordinate the variables which condition the sugarcane yield, according to their relative importance, as well as the development of mathematical models for predicting sugarcane yield. For this, three Data Mining techniques were applied in the analyses of data bases of several sugar mills in the State of São Paulo, Brazil. Meteorological and crop management variables were analyzed through the Data Mining techniques Random Forest, Boosting and Support Vector Machines, and the resulting models were tested through the comparison with an independent data set, using the coefficient of correlation (r), Willmott index (d), confidence index of Camargo (c), mean absolute error (MAE), and root mean square error (RMSE). Finally, the predictive performances of these models were compared with the performance of an agrometeorological model, applied in the same data set. The results allowed to conclude that, within all the variables, the number of cuts was the most important factor considered by all Data Mining models. The comparison between the observed yields and those estimated by the Data Mining techniques resulted in a RMSE ranging between 19,70 to 20,03 t ha-1, in the general method, which considered all regions of the data base. Thus, the predictive performances of the Data Mining algorithms were superior to that of the agrometeorological model, which presented RMSE ≈ 70% higher (≈ 34 t ha-1).
|
58 |
Αναγνώριση γονιδιακών εκφράσεων νεοπλασιών σε microarrays / Identification of tumor gene expression from microarraysΤσακανίκας, Παναγιώτης 16 May 2007 (has links)
Η ουσιώδης ανάπτυξη που παρουσίασε η μοριακή παθολογία τα τελευταία χρόνια, είναι συνυφασμένη με την ανάπτυξη της microarray τεχνολογίας. Αυτή η τεχνολογία μας παρέχει μια νέα οδό προσπέλασης υψηλής χωρητικότητας τέτοια ώστε: i. να δίνεται η δυνατότητα ανάλυσης μεγάλης κλίμακας της ισοτοπικής αφθονίας του αγγελιοφόρου RNA (mRNA), ως δείκτη γονιδιακών εκφράσεων (cDNA arrays), ii. να ανιχνεύονται πολυμορφισμοί ή μεταλλάξεις μέσα σε έναν πληθυσμό γονιδίων χρησιμοποιώντας ξεχωριστούς nucleotide πολυμορφισμούς (Single Nucleotide Polymorphisms arrays), iii. και για εξέταση «απώλειας» ή «κέρδους», ή αλλαγές στον αριθμό αντιγραφής κάποιου συγκεκριμένου γονιδίου που σχετίζεται με κάποια ασθένεια (CGH arrays). Η τεχνολογία των microarrays είναι ευλογοφανές να εξελιχθεί σε ακρογωνιαίο λίθο της μοριακής έρευνας στα επόμενα χρόνια, και αυτό γιατί DNA microarrays χρησιμοποιούνται για τον ποσοτικό προσδιορισμό δεκάδων χιλιάδων DNA ή RNA ακολουθιών σε μια και μόνο ανάλυση – πείραμα. Από μια σειρά από τέτοια πειράματα, είναι δυνατόν να προσδιορίσουμε τους μηχανισμούς που ελέγχουν την ενεργοποίηση των γονιδίων σε έναν οργανισμό. Ακόμη η χρήση των microarrays για την επισκόπηση γονιδιακών εκφράσεων είναι μια ραγδαία αναπτυσσόμενη τεχνολογία, η οποία μετακινήθηκε από εξειδικευμένα, σε συμβατικά βιολογικά εργαστήρια. Στην παρούσα διπλωματική εργασία κυρίως, θα αναφερθούμε στα δύο γενικά στάδια της ανάλυσης των microarray εικόνων που μας δίνονται ως απόρρεια των πειραμάτων, που έχουν ως στόχο την εξόρυξη πληροφορίας από αυτές. Τα δύο αυτά στάδια είναι: i. Επεξεργασία εικόνας και εξαγωγή πληροφορίας από αυτήν. ii. Ανάλυση της προκύπτουσας πληροφορίας και αναγνώριση των γονιδιακών εκφράσεων. Όσον αφορά το πρώτο στάδιο θα αναφέρουμε τις βασικές μεθόδους που χρησιμοποιούνται σήμερα από εμπορικά και εκπαιδευτικά πακέτα λογισμικού, οι οποίες μέθοδοι δίνουν και τα καλύτερα αποτελέσματα μέχρι στιγμής. Για το δεύτερο στάδιο, αφού αναφέρουμε τις πιο σημαντικές μεθόδους που χρησιμοποιούνται, θα υλοποιήσουμε μια δική μας μέθοδο και θα την συγκρίνουμε με τις υπάρχουσες.
|
59 |
"Investigação de estratégias para a geração de máquinas de vetores de suporte multiclasses" / Investigation of strategies for the generation of multiclass support vector machinesLorena, Ana Carolina 16 February 2006 (has links)
Diversos problemas envolvem a classificação de dados em categorias, também denominadas classes. A partir de um conjunto de dados cujas classes são conhecidas, algoritmos de Aprendizado de Máquina (AM) podem ser utilizados na indução de um classificador capaz de predizer a classe de novos dados do mesmo domínio, realizando assim a discriminação desejada. Dentre as diversas técnicas de AM utilizadas em problemas de classificação, as Máquinas de Vetores de Suporte (Support Vector Machines - SVMs) se destacam por sua boa capacidade de generalização. Elas são originalmente concebidas para a solução de problemas com apenas duas classes, também denominados binários. Entretanto, diversos problemas requerem a discriminação dos dados em mais que duas categorias ou classes. Nesta Tese são investigadas e propostas estratégias para a generalização das SVMs para problemas com mais que duas classes, intitulados multiclasses. O foco deste trabalho é em estratégias que decompõem o problema multiclasses original em múltiplos subproblemas binários, cujas saídas são então combinadas na obtenção da classificação final. As estratégias propostas visam investigar a adaptação das decomposições a cada aplicação considerada, a partir de informações do desempenho obtido em sua solução ou extraídas de seus dados. Os algoritmos implementados foram avaliados em conjuntos de dados gerais e em aplicações reais da área de Bioinformática. Os resultados obtidos abrem várias possibilidades de pesquisas futuras. Entre os benefícios verificados tem-se a obtenção de decomposições mais simples, que requerem menos classificadores binários na solução multiclasses. / Several problems involve the classification of data into categories, also called classes. Given a dataset containing data whose classes are known, Machine Learning (ML) algorithms can be employed for the induction of a classifier able to predict the class of new data from the same domain, thus performing the desired discrimination. Among the several ML techniques applied to classification problems, the Support Vector Machines (SVMs) are known by their high generalization ability. They are originally conceived for the solution of problems with only two classes, also named binary problems. However, several problems require the discrimination of examples into more than two categories or classes. This thesis investigates and proposes strategies for the generalization of SVMs to problems with more than two classes, known as multiclass problems. The focus of this work is on strategies that decompose the original multiclass problem into multiple binary subtasks, whose outputs are then combined to obtain the final classification. The proposed strategies aim to investigate the adaptation of the decompositions for each multiclass application considered, using information of the performance obtained for its solution or extracted from its examples. The implemented algorithms were evaluated on general datasets and on real applications from the Bioinformatics domain. The results obtained open possibilities of many future work. Among the benefits observed is the obtainment of simpler decompositions, which require less binary classifiers in the multiclass solution.
|
60 |
"Classificação de páginas na internet" / "Internet pages classification"Martins Júnior, José 11 April 2003 (has links)
O grande crescimento da Internet ocorreu a partir da década de 1990 com o surgimento dos provedores comerciais de serviços, e resulta principalmente da boa aceitação e vasta disseminação do uso da Web. O grande problema que afeta a escalabilidade e o uso de tal serviço refere-se à organização e à classificação de seu conteúdo. Os engenhos de busca atuais possibilitam a localização de páginas na Web pela comparação léxica de conjuntos de palavras perante os conteúdos dos hipertextos. Tal mecanismo mostra-se ineficaz quando da necessidade pela localização de conteúdos que expressem conceitos ou objetos, a exemplo de produtos à venda oferecidos em sites de comércio eletrônico. A criação da Web Semântica foi anunciada no ano de 2000 para esse propósito, visando o estabelecimento de novos padrões para a representação formal de conteúdos nas páginas Web. Com sua implantação, cujo prazo inicialmente previsto foi de dez anos, será possível a expressão de conceitos nos conteúdos dos hipertextos, que representarão objetos classificados por uma ontologia, viabilizando assim o uso de sistemas, baseados em conhecimento, implementados por agentes inteligentes de software. O projeto DEEPSIA foi concebido como uma solução centrada no comprador, ao contrário dos atuais Market Places, para resolver o problema da localização de páginas Web com a descrição de produtos à venda, fazendo uso de métodos de classificação de textos, apoiados pelos algoritmos k-NN e C4.5, no suporte ao processo decisório realizado por um agente previsto em sua arquitetura, o Crawler Agent. Os testes com o sistema em sites brasileiros denotaram a necessidade pela sua adaptação em diversos aspectos, incluindo-se o processo decisório envolvido, que foi abordado pelo presente trabalho. A solução para o problema envolveu a aplicação e a avaliação do método Support Vector Machines, e é descrita em detalhes. / The huge growth of the Internet has been occurring since 90s with the arrival of the internet service providers. One important reason is the good acceptance and wide dissemination of the Web. The main problem that affects its scalability and usage is the organization and classification of its content. The current search engines make possible the localization of pages in the Web by means of a lexical comparison among sets of words and the hypertexts contents. In order to find contents that express concepts or object, such as products for sale in electronic commerce sites such mechanisms are inefficient. The proposition of the Semantic Web was announced in 2000 for this purpose, envisioning the establishment of new standards for formal contents representation in the Web pages. With its implementation, whose deadline was initially stated for ten years, it will be possible to express concepts in hypertexts contents, that will fully represent objects classified into an ontology, making possible the use of knowledge based systems implemented by intelligent softwares agents. The DEEPSIA project was conceived as a solution centered in the purchaser, instead of current Market Places, in order to solve the problem of finding Web pages with products for sale description, making use of methods of text classification, with k-NN and C4.5 algorithms, to support the decision problem to be solved by an specific agent designed, the Crawler Agent. The tests of the system in Brazilian sites have denoted the necessity for its adaptation in many aspects, including the involved decision process, which was focused in present work. The solution for the problem includes the application and evaluation of the Support Vector Machines method, and it is described in detail.
|
Page generated in 0.0932 seconds