11 |
Classificação de Proteínas usando Máquinas de Aprendizagem e Descoberta de Padrões / Protein Classification Using Learning Machines and Pattern Discovery
do Nascimento Júnior, Francisco 31 January 2008
Learning machines have been applied to a variety of problems in bioinformatics. Similarly, pattern discovery algorithms have been used to find motifs in protein sequences, contributing to the definition of signatures (such as fingerprints) that characterize functional classes of proteins. One example is the class of G-protein-coupled receptors (GPCRs), which is one of the largest families in the human genome. This family is a major research target for the discovery and development of new drugs and is therefore of great interest to the pharmaceutical industry. The model proposed in this dissertation combines learning machines, such as SVM (Support Vector Machine) and MLP (Multilayer Perceptron), with pattern discovery methods to develop a procedure for predicting the relationship between a primary protein sequence and its functional class. As a case study, this work presents experiments with the GPCR superfamily, using patterns in the form of regular expressions extracted from this family by SPEXS (Sequence Pattern EXhaustive Search), a pattern discovery algorithm.
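One way to picture how regular-expression patterns can feed a learning machine is to turn each sequence into a vector of pattern hits. The sketch below is only an illustration under stated assumptions: the three patterns are hypothetical placeholders, not patterns extracted by SPEXS, and the binary encoding is one plausible choice rather than the encoding used in the dissertation.

```python
import re

# Hypothetical motif patterns, for illustration only; real patterns would
# come from a pattern discovery tool run on the protein family.
PATTERNS = [r"C.{2}C", r"DRY", r"N.{3}PY"]


def pattern_features(sequence, patterns=PATTERNS):
    """Return a 0/1 vector with a 1 wherever the pattern occurs in the sequence."""
    return [1 if re.search(p, sequence) else 0 for p in patterns]
```

A vector such as `pattern_features("ACGHCNLLLPY")` could then serve as the input to an SVM or MLP classifier.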
|
12 |
Label Noise Cleaning Using Support Vector Machines
Ekambaram, Rajmadhan 11 February 2016
Mislabeled examples affect the performance of supervised learning algorithms. Two novel approaches to this problem are presented in this thesis. Both methods build on the hypothesis that the large margin and soft margin principles of support vector machines provide the characteristics needed to select mislabeled examples. Extensive experimental results on several datasets support this hypothesis. The support vectors of the one-class and two-class SVM classifiers capture around 85% and 99%, respectively, of the randomly generated label noise examples (10% of the training data) on two character recognition datasets. The number of examples that need to be reviewed can be reduced by creating a two-class SVM classifier with the non-support vector examples, and then reviewing the support vector examples in order of their classification score from that classifier. Experimental results on four datasets show that this method removes around 95% of the mislabeled examples by reviewing only about 14% of the training data. The parameter independence of this method is also verified through the experiments. All the experimental results show that most of the label noise examples can be removed by (re-)examining the selected support vector examples. This property can be very useful when building large labeled datasets.
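The idea that support vectors are the natural candidates for label review can be sketched on a toy problem. The code below is a minimal illustration, not the method of the thesis: it swaps in a one-dimensional linear soft-margin SVM trained by plain subgradient descent, and the data, the hyperparameters (`lam`, `lr`, `epochs`) and the function names are all assumptions made for the example.

```python
def train_linear_svm(xs, ys, lam=0.01, lr=0.01, epochs=2000):
    """Full-batch subgradient descent on the L2-regularised hinge loss (1-D inputs)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w, grad_b = lam * w, 0.0
        for x, y in zip(xs, ys):
            if y * (w * x + b) < 1:  # this example violates the margin
                grad_w -= y * x / n
                grad_b -= y / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def review_candidates(xs, ys, w, b, tol=1e-6):
    """Indices whose functional margin is <= 1 (soft-margin support vectors), worst first."""
    margins = [(y * (w * x + b), i) for i, (x, y) in enumerate(zip(xs, ys))]
    return [i for m, i in sorted(margins) if m <= 1 + tol]


# Toy data: the last example (x = 4.5) carries a deliberately flipped label.
XS = [-5.0, -4.0, -3.0, -2.0, 2.0, 3.0, 4.0, 5.0, 4.5]
YS = [-1, -1, -1, -1, 1, 1, 1, 1, -1]
```

Calling `review_candidates(XS, YS, *train_linear_svm(XS, YS))` returns a short list headed by the flipped example, while examples far from the decision boundary are left out of the review.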
|
13 |
Novel Techniques for Processing Data with an FMCW radar
Null, Thomas C 17 August 2013
This dissertation examines and analyzes novel techniques that are useful in the collection and processing of data from a Frequency Modulated Continuous Wave (FMCW) radar. The major topics discussed in this work are: reduction of amplitude modulation, signature collection without an anechoic chamber, transforming a signature into a matched filter, accounting for electromagnetic interference, accounting for digital noise, and the application of a Support Vector Machine to achieve classification. In addition, this work provides a broad overview of a framework specifically developed to improve detection and classification without requiring expensive hardware modification. The four main categories analyzed in this work are distortion, spectral signature, optimal detection, and classification. Notable contributions include an assessment of how effectively a novel technique improves model accuracy by accounting for amplitude modulation in an FMCW radar, as well as a discussion of improved techniques for signature collection with an FMCW radar in the absence of an anechoic chamber. The signature collection technique is a novel approach that uses physics and wavelets to improve the Signal to Noise Ratio (SNR). This work also considers a novel technique to convert an FMCW target signature into coefficients for a matched filter, thus allowing for the full mathematical application of the optimal matched filter. In addition, this work analyzes the improved performance of an FMCW radar obtained through the development and use of a novel technique to account for both electromagnetic interference and digital noise. Finally, the initial discovery, development, and refinement of an innovative application using SVM to classify the matched filter results of FMCW radar targets is presented, yielding previously uncollected and undocumented viable baseline data.
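The step from a target signature to matched filter coefficients can be illustrated with the textbook construction, in which the filter for a real-valued signature in white noise is the signature reversed in time. This is a generic sketch and not the conversion technique developed in the dissertation; the function names and the sample values are assumptions.

```python
def matched_filter_coefficients(signature):
    """For a real-valued signature, the matched filter is its time reverse."""
    return list(reversed(signature))


def fir_filter(signal, coeffs):
    """Full linear convolution of the signal with the filter coefficients."""
    out = [0.0] * (len(signal) + len(coeffs) - 1)
    for i, s in enumerate(signal):
        for j, c in enumerate(coeffs):
            out[i + j] += s * c
    return out


def detect_peak(signal, signature):
    """Return (index, value) of the largest matched-filter output sample."""
    out = fir_filter(signal, matched_filter_coefficients(signature))
    peak = max(range(len(out)), key=lambda k: out[k])
    return peak, out[peak]
```

The output peaks where the signature is fully aligned with the signal, and the peak value equals the energy of the signature when the signal contains an exact copy of it. Peak values of this kind are the sort of quantity a classifier such as an SVM could take as input.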
|
14 |
Second-order Cyclostationary Feature Based Detection of WiMAX Signals in Pulsed Noise Environments
Davis, Joseph M. 05 December 2013
Spectral coexistence and cooperative spectrum sharing techniques are vital to the continued development and proliferation of wireless communications systems. Government directives indicate that certain frequency bands which once were reserved for radar-only applications must now support wireless broadband systems. The effect of co-site interference upon detection techniques for wireless broadband systems is evaluated. Cyclostationary feature based detection methods are evaluated against Gaussian noise and interfering radar signals. Alternative decision algorithms utilizing support vector machines are proposed, evaluated, and compared against traditional generalized likelihood ratio test algorithms. Recommendations for certain algorithms and observation window lengths to maximize effectiveness and minimize computational complexity are developed. / Office of Naval Research grant N00014-12-1-0062 and contract N00014-12-C-0702 / Master of Science
|
15 |
Automatické označování obrázků / Automatic Image Labelling
Sýkora, Michal January 2012
This work focuses on the automatic classification of images into semantic classes based on their content, with an emphasis on SVM classifiers. The main objective of this work is to improve classification accuracy on large datasets. Both linear and nonlinear SVM classifiers are considered. In addition, the possibility of transforming features with Restricted Boltzmann Machines and then using a linear SVM is explored. All these approaches are compared in terms of accuracy, computational demands, resource utilization, and possibilities for future research.
|
16 |
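Étude de précision et de performance du processus de classification d'images de phytoplancton à l'aide de machines à vecteurs de support / Study of the Accuracy and Performance of the Phytoplankton Image Classification Process Using Support Vector Machines
Morin, Eugène January 2014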
This research project aims to study and improve the accuracy of phytoplankton image classification and to reduce the average processing time required per image. Two classification solutions are proposed to achieve these objectives. The first solution classifies images through preprocessing, discrimination, and classification phases; the second uses only the preprocessing and classification phases.
In summary, the preprocessing phase manipulates an image in order to characterize its main element (the phytoplankton); the discrimination phase uses interval decision trees to eliminate categories that have little or no similarity to the image being processed; and finally, the classification phase uses support vector machines (SVMs) to predict a category for each processed image.
At the base of the system, an automated image capture device transmits images to a classifier. Depending on the classification speed, some or all of the generated images are classified. The larger the number of samples classified, the better the approximation of the population of each phytoplankton group at a given time. The goal is to obtain a more precise qualitative, quantitative, and temporal analysis of this microorganism.
To enable the classification of this type of image, a software application named Biotaxis was developed. It lets the user choose between the two classification solutions proposed above. Both begin with the training of a classification group, which is composed of several image categories, followed by classification tests performed on this group to verify the classification accuracy of the image categories it contains. To train and test the Biotaxis classifier, two image sets were used: one serves only to train classification groups and the other to test them.
The results obtained in this research project confirmed the validity of the two proposed solutions. An average classification accuracy of 87% or more was reached with classification groups of 13 categories or fewer. In addition, an average processing time of less than 200 ms per image was achieved with these same classification groups.
The Biotaxis software is proposed as a new solution for rapidly classifying phytoplankton images.
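The difference between the two solutions is whether a discrimination phase prunes categories before the final classifier runs. The sketch below illustrates that control flow only and is not the Biotaxis implementation: a simple per-feature interval check stands in for the interval decision trees, a nearest-centroid rule stands in for the SVM stage, and the category names and feature values used to exercise it are hypothetical.

```python
def fit_intervals(training):
    """training: {category: [feature_vector, ...]} -> {category: [(lo, hi), ...]}"""
    return {cat: [(min(col), max(col)) for col in zip(*vectors)]
            for cat, vectors in training.items()}


def discriminate(features, intervals):
    """Keep only the categories whose every interval contains the feature value."""
    return [cat for cat, ivs in intervals.items()
            if all(lo <= f <= hi for f, (lo, hi) in zip(features, ivs))]


def centroid(vectors):
    """Component-wise mean of a list of feature vectors."""
    return [sum(col) / len(col) for col in zip(*vectors)]


def classify(features, training, candidates):
    """Stand-in for the SVM stage: nearest centroid among the candidate categories."""
    def dist2(cat):
        return sum((f - c) ** 2 for f, c in zip(features, centroid(training[cat])))
    return min(candidates, key=dist2)


def predict(features, training):
    """Discrimination followed by classification; fall back to all categories if none survive."""
    candidates = discriminate(features, fit_intervals(training)) or list(training)
    return classify(features, training, candidates)
```

Skipping `discriminate` and passing every category to `classify` corresponds to the second solution, which trades the pruning step for a simpler pipeline.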
|
17 |
A Machine Learning Approach to Determine Oyster Vessel Behavior
Frey, Devin 16 December 2016
A support vector machine (SVM) classifier was designed to replace a previous classifier which predicted oyster vessel behavior in the public oyster grounds of Louisiana. The SVM classifier predicts vessel behavior (docked, poling, fishing, or traveling) based on each vessel’s speed and either net speed or movement angle. The data from these vessels was recorded by a Vessel Monitoring System (VMS), and stored in a PostgreSQL database. The SVM classifier was written in Python, using the scikit-learn library, and was trained by using predictions from the previous classifier. Several validation and parameter optimization techniques were used to improve the SVM classifier’s accuracy. The previous classifier could classify about 93% of points from July 2013 to August 2014, but the SVM classifier can classify about 99.7% of those points. This new classifier can easily be expanded with additional features to further improve its predictive capabilities.
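Features such as speed and movement angle can be derived from consecutive position reports before any classifier is trained. The following sketch shows one possible derivation and makes several assumptions that are not in the abstract: pings are `(t_seconds, x_m, y_m)` tuples in a projected planar coordinate system, consecutive pings have distinct timestamps, and "movement angle" is taken to mean the change of heading at a ping.

```python
import math


def speed(p, q):
    """Speed in m/s between two pings given as (t_seconds, x_m, y_m)."""
    return math.hypot(q[1] - p[1], q[2] - p[2]) / (q[0] - p[0])


def heading(p, q):
    """Direction of travel in degrees, counter-clockwise from the +x axis."""
    return math.degrees(math.atan2(q[2] - p[2], q[1] - p[1]))


def turn_angle(p, q, r):
    """Absolute change of heading at q, folded into [0, 180] degrees."""
    diff = abs(heading(q, r) - heading(p, q)) % 360.0
    return min(diff, 360.0 - diff)


def features(track):
    """One (speed, turn_angle) pair for each interior ping of a track."""
    return [(speed(p, q), turn_angle(p, q, r))
            for p, q, r in zip(track, track[1:], track[2:])]
```

Rows produced by `features` could then be passed, together with behavior labels, to a classifier such as scikit-learn's `sklearn.svm.SVC`.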
|
18 |
An Embedded Seizure Onset Detection System
Kindle, Alexander Lawrence 12 September 2013
A combined hardware and software platform for ambulatory seizure onset detection is presented. The hardware is developed around commercial off-the-shelf components, featuring ADS1299 analog front ends for electroencephalography from Texas Instruments and a Broadcom ARM11 microcontroller for algorithm execution. The onset detection algorithm is a patient-specific support vector machine algorithm. It outperforms a state-of-the-art detector on a reference data set, with 100% sensitivity, 3.4-second average onset detection latency, and on average 1 false positive per 24 hours. The algorithm is then evaluated on the more comprehensive European Epilepsy Database, which highlights several real-world challenges for seizure onset detection, resulting in a reduced average sensitivity of 93.5%, a 5-second average onset detection latency, and 85.5% specificity. Algorithm enhancements to improve this reduced performance are proposed.
|
19 |
Multiclass Classification of SRBCTs
Yeo, Gene, Poggio, Tomaso 25 August 2001
A novel approach to multiclass tumor classification using Artificial Neural Networks (ANNs) was introduced in a recent paper [Khan2001]. The method successfully classified and diagnosed small, round blue cell tumors (SRBCTs) of childhood into four distinct categories, neuroblastoma (NB), rhabdomyosarcoma (RMS), non-Hodgkin lymphoma (NHL), and the Ewing family of tumors (EWS), using cDNA gene expression profiles of samples that included both tumor biopsy material and cell lines. We report that using an approach similar to the one reported by Yeang et al. [Yeang2001], i.e., multiclass classification by combining the outputs of binary classifiers, we achieved equal accuracy with far fewer features. We report the performance of three binary classifiers (k-nearest neighbors (kNN), weighted voting (WV), and support vector machines (SVM)) with three feature selection techniques (Golub's signal-to-noise (SN) ratios [Golub99], Fisher scores (FSc), and Mukherjee's SVM feature selection (SVMFS) [Sayan98]).
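Two ingredients named in this abstract lend themselves to a short sketch: the signal-to-noise score used to rank genes, and the combination of binary classifier outputs into a multiclass decision. The code below is illustrative only. It assumes the common form of the score, the difference of class means divided by the sum of class standard deviations, uses the sample standard deviation, and combines one-vs-all outputs by taking the largest one; the paper may differ on each of these points.

```python
from statistics import mean, stdev


def signal_to_noise(class_a, class_b):
    """(mean_a - mean_b) / (std_a + std_b) for one gene's expression values."""
    return (mean(class_a) - mean(class_b)) / (stdev(class_a) + stdev(class_b))


def rank_genes(expr_a, expr_b):
    """expr_a, expr_b: {gene: [values]} per class; genes sorted by |score|, largest first."""
    scores = {g: signal_to_noise(expr_a[g], expr_b[g]) for g in expr_a}
    return sorted(scores, key=lambda g: abs(scores[g]), reverse=True)


def combine_one_vs_all(outputs):
    """outputs: {class: real-valued output of that class's binary classifier}."""
    return max(outputs, key=outputs.get)
```

For example, `combine_one_vs_all({"NB": -0.2, "RMS": 0.7, "NHL": -1.1, "EWS": 0.1})` selects `"RMS"`; the numbers are made up to show the mechanics.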
|
20 |
Developing Optical Character Recognition for Ethiopic Scripts
Demissie, Fitsum January 2011
Amharic is the official language of over 70 million people, mainly in Ethiopia. An extensive literature survey and a government report reveal that no Amharic character recognition system exists in the country. The Amharic script has 33 basic characters, each with seven orders, giving 310 distinct characters including numbers and punctuation symbols. The characters are visually similar; there is a typeface, but no capitalization. In addition, there is no standard font for using the language on computers. Instead, different stakeholders have developed different fonts according to their own conventions and interests, which creates incompatibility between fonts and documents. This project investigates why Amharic optical character recognition has not been addressed by local and international researchers and developers, and then develops an Amharic optical character recognition system that uses the features and facilities of Microsoft Windows Vista or 7 and the Unicode standard.
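The structure of basic characters and orders is reflected in the Unicode Ethiopic block, which begins at U+1200. The sketch below lists the seven orders of a consonant from its row in the block. It is a small illustration, not part of the project described above: it assumes the regular layout of eight code points per row that holds for the first rows of the block (later rows are less regular), and the constant and function names are made up for the example.

```python
import unicodedata

ETHIOPIC_BASE = 0x1200  # first code point of the Unicode Ethiopic block
ROW_STRIDE = 8          # code points per consonant row in the first rows of the block
ORDERS = 7              # orders (vowel forms) per basic character


def seven_orders(row):
    """Return the seven order forms of the consonant in the given row (0 = first)."""
    start = ETHIOPIC_BASE + row * ROW_STRIDE
    return [chr(start + k) for k in range(ORDERS)]


def describe(char):
    """Unicode character name, useful when checking recognizer output."""
    return unicodedata.name(char)
```

With 33 basic characters and seven orders each, the syllable forms alone account for 33 × 7 = 231 characters; the total of 310 given in the abstract also counts numbers and punctuation symbols.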
|