Global ETD Search

271	A cloud-based intelligent and energy efficient malware detection framework : a framework for cloud-based, energy efficient, and reliable malware detection in real-time based on training SVM, decision tree, and boosting using specified heuristics anomalies of portable executable files Mirza, Qublai K. A. January 2017 The continuing financial and related losses due to cyber-attacks demonstrate the substantial growth of malware and its lethal proliferation techniques. Every successful malware attack highlights the weaknesses in the defence mechanisms responsible for securing the targeted computer or network. Recent cyber-attacks reveal sophistication and intelligence in malware behaviour, including the ability to conceal code and operate within the system autonomously. Conventional detection mechanisms not only have limited malware detection capabilities, they also consume a large amount of resources while scanning for malicious entities in the system. Many recent reports have highlighted this issue along with the challenges faced by alternative solutions and studies conducted in the same area. There is an unprecedented need for a resilient and autonomous solution that takes a proactive approach against modern malware with stealth behaviour. This thesis proposes a multi-aspect solution comprising an intelligent malware detection framework and an energy efficient hosting model. The malware detection framework is a combination of conventional and novel malware detection techniques. The proposed framework incorporates comprehensive feature heuristics of files generated by a bespoke static feature extraction tool. These comprehensive heuristics are used to train the machine learning algorithms (Support Vector Machine, Decision Tree, and Boosting) to differentiate between clean and malicious files. The two techniques, feature heuristics and machine learning, are combined to form a two-factor detection mechanism. 
This thesis also presents a cloud-based, energy efficient and scalable hosting model, which combines multiple infrastructure components of Amazon Web Services to host the malware detection framework. This hosting model uses a client-server architecture, where the client is a lightweight service running on the host machine and the server is based in the cloud. The proposed framework and the hosting model were evaluated individually and in combination by specifically designed experiments using separate repositories of clean and malicious files. The experiments were designed to evaluate the malware detection capabilities and energy efficiency while operating within a system. The proposed malware detection framework and the hosting model showed significant improvement in malware detection while consuming few CPU resources during operation.
272	Optimization algorithms for SVM classification : Applications to geometrical chromosome analysis / Algorithmes d'optimisation pour la classification via SVM : application à l'analyse géométrique des chromosomes Wang, Wenjuan 16 September 2016 Le génome est très organisé au sein du noyau cellulaire. Cette organisation et plus spécifiquement la localisation et la dynamique des gènes et chromosomes contribuent à l'expression génétique et la différenciation des cellules que ce soit dans le cas de pathologies ou non. L'exploration de cette organisation pourrait dans le futur aider à diagnostiquer et identifier de nouvelles cibles thérapeutiques. La conformation des chromosomes peut être analysée grâce au marquage ADN sur plusieurs sites et aux mesures de distances entre ces différents marquages fluorescents. Dans ce contexte, l'organisation spatiale du chromosome III de levure a montré que les deux types de cellules, MATa et MATalpha, sont différents. Par contre, les données issues de l'imagerie électronique sont bruitées à cause de la résolution des systèmes de microscope et du fait du caractère vivant des cellules observées. Dans cette thèse, nous nous intéressons au développement de méthodes de classification pour différencier les types de cellules sur la base de mesures de distances entre 3 loci du chromosome III et d'une estimation du bruit. Dans un premier temps, nous nous intéressons de façon générale aux problèmes de classification binaire à l'aide de SVM de grandes tailles et passons en revue les algorithmes d'optimisation stochastiques du premier ordre. Afin de prendre en compte les incertitudes, nous proposons un modèle d'apprentissage qui ajuste sa robustesse en fonction du bruit. La méthode évite les situations où le modèle est trop conservatif et que l'on rencontre parfois avec les formulations SVM robustes. 
L'amplitude des perturbations liées au bruit qui sont incorporées dans le modèle est contrôlée par l'optimisation d'une erreur de généralisation. Aucune hypothèse n'est faite sur la distribution de probabilité du bruit. Seule une borne estimée des perturbations est nécessaire. Le problème peut s'écrire sous la forme d'un programme biniveaux de grande taille. Afin de le résoudre, nous proposons un algorithme biniveau qui réalise des déplacements stochastiques très peu coûteux et donc adapté aux problèmes de grandes tailles. La convergence de l'algorithme est prouvée pour une classe générale de problèmes. Nous présentons des résultats numériques très encourageants qui confirment que la technique est meilleure que l'approche SOCP (Second Order Cone Programming) pour plusieurs bases de données publiques. Les expériences numériques montrent également que la nonlinéarité additionnelle générée par l'incertitude sur les données pénalise la classification des chromosomes et motivent des recherches futures sur une version nonlinéaire de la technique proposée. Enfin, nous présentons également des résultats numériques de l'algorithme biniveau stochastique pour la sélection automatique de l'hyperparamètre de pénalité dans les SVM. L'approche évite les coûteux calculs que l'on doit inévitablement réaliser lorsque l'on effectue une validation croisée sur des problèmes de grandes tailles. / The genome is highly organized within the cell nucleus. This organization, in particular the localization and dynamics of genes and chromosomes, is known to contribute to gene expression and cell differentiation in normal and pathological contexts. The exploration of this organization may help to diagnose disease and to identify new therapeutic targets. Conformation of chromosomes can be analyzed by distance measurements of distinct fluorescently labeled DNA sites. In this context, the spatial organization of yeast chromosome III was shown to differ between two cell types, MATa and MATalpha. 
However, imaging data are subject to noise, due to microscope resolution and the living state of yeast cells. In this thesis, the aim is to develop new classification methods to discriminate two mating types of yeast cells based on distance measurements between three loci on chromosome III, aided by an estimate of the bound on the perturbations. We first address the issue of solving large scale SVM binary classification problems and review state of the art first order stochastic optimization algorithms. To deal with uncertainty, we propose a learning model that adjusts its robustness to noise. The method avoids the overly conservative situations that can be encountered with worst case robust support vector machine formulations. The magnitude of the noise perturbations that is incorporated in the model is controlled by optimizing a generalization error. No assumption is made on the distribution of the noise. Only rough estimates of perturbation bounds are required. The resulting problem is a large scale bi-level program. To solve it, we propose a bi-level algorithm that performs very cheap stochastic gradient moves and is therefore well suited to large datasets. The convergence is proven for a class of general problems. We present encouraging experimental results confirming that the technique outperforms robust second order cone programming formulations on public datasets. The experiments also show that the extra nonlinearity generated by the uncertainty in the data penalizes the classification of chromosome data, and they advocate for further research on nonlinear robust models. Additionally, we provide experimental results for the bilevel stochastic algorithm used to perform automatic selection of the penalty parameter in linear and non-linear support vector machines. This approach avoids the expensive computations that usually arise in k-fold cross validation. 
Algorithmes d'optimisation Incertitude des données Machine à vecteur de support Optimisation biniveau Algorithmes stochastiques Optimisation robuste Optimization algorithms Data uncertainty Support vector machine Bilevel optimization Stochastic algorithms Robust optimization
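For readers unfamiliar with the "very cheap stochastic gradient moves" mentioned in entry 272, the following is a minimal sketch of a plain stochastic subgradient method for a linear SVM without a bias term. It is not the bilevel algorithm of the thesis, which the abstract does not specify. The function names, the fixed step size, the regularization weight `lam` and the toy data are all assumptions made for illustration.

```python
import random


def sgd_linear_svm(samples, labels, lam=0.01, step=0.1, epochs=20, seed=0):
    """Minimize lam/2 * ||w||^2 + mean hinge loss with stochastic subgradient steps."""
    rng = random.Random(seed)
    w = [0.0] * len(samples[0])
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the samples in a fresh random order each epoch
        for i in order:
            x, y = samples[i], labels[i]
            margin = y * sum(wj * xj for wj, xj in zip(w, x))
            # The regularizer's gradient shrinks w on every step.
            w = [(1.0 - step * lam) * wj for wj in w]
            # The hinge loss contributes only when the margin is violated.
            if margin < 1.0:
                w = [wj + step * y * xj for wj, xj in zip(w, x)]
    return w


def predict(w, x):
    """Classify x by the sign of the linear score w . x."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1


# Hypothetical, linearly separable toy data (not taken from the thesis).
X = [(2.0, 2.0), (3.0, 3.0), (2.0, 3.0), (-2.0, -2.0), (-3.0, -3.0), (-2.0, -3.0)]
Y = [1, 1, 1, -1, -1, -1]
w = sgd_linear_svm(X, Y)
print([predict(w, x) for x in X])
```

Each update touches a single sample, so the cost per step is independent of the dataset size, which is the property that makes this family of methods attractive for large problems.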
273	A Comparison of Machine Learning Techniques for Facial Expression Recognition Deaney, Mogammat Waleed January 2018 Magister Scientiae - MSc (Computer Science) / A machine translation system that can convert South African Sign Language (SASL) video to audio or text and vice versa would be beneficial to people who use SASL to communicate. Five fundamental parameters are associated with sign language gestures: hand location; hand orientation; hand shape; hand movement and facial expressions. The aim of this research is to recognise facial expressions and to compare both feature descriptors and machine learning techniques. This research used the Design Science Research (DSR) methodology. A DSR artefact was built which consisted of two phases. The first phase compared local binary patterns (LBP), compound local binary patterns (CLBP) and histogram of oriented gradients (HOG) using support vector machines (SVM). The second phase compared the SVM to artificial neural networks (ANN) and random forests (RF) using the most promising feature descriptor, HOG, from the first phase. The performance was evaluated in terms of accuracy, robustness to classes, robustness to subjects and ability to generalise on both the Binghamton University 3D facial expression (BU-3DFE) and Cohn Kanade (CK) datasets. The first evaluation phase showed HOG to be the best feature descriptor, followed by CLBP and LBP. The second showed ANN to be the best choice of machine learning technique, closely followed by the SVM and RF.
274	Uso de inteligência artificial para estimativa da capacidade de suporte de carga do solo / Use of artificial intelligence to soil load support capacity estimate Pereira, Tonismar dos Santos 13 February 2017 Knowledge of the relationships between physical and mechanical properties of the soil may contribute to the development of pedotransfer functions (PTFs) that estimate other soil properties that are difficult to measure. The objectives of this work were to estimate the preconsolidation pressure and soil resistance to penetration with predictive methodologies, using data available in the literature on the physical-hydrological and mineralogical characteristics of soils. The development of PTFs was based on three modeling methods: (i) multiple linear regression (MLR), (ii) artificial neural networks (ANNs) and (iii) support vector machines (SVM). The first proposed methodology for the development of PTFs was the stepwise option of the IBM-SPSS 20.0® software. The models generated by the second methodology, i.e. ANNs, were implemented as multilayer perceptrons with the backpropagation algorithm and Levenberg-Marquardt optimization in the Matlab®2008b software, varying the number of neurons in the input layer and the number of neurons in the intermediate layer. The third methodology was to generate PTFs from SVMs, which fit within the data mining process, using the Waikato Environment for Knowledge Analysis software (RapidMiner 5). The SVM training was performed by varying the number of input data, the kernel function and the coefficients of these functions. Once the estimates were made, the performance indices (id) were calculated and classified according to Camargo and Sentelhas (1997), thus allowing the methods to be compared with each other and with others already established. 
The results showed that artificial intelligence models (ANN and SVM) are efficient and have predictive capacity superior to the established models for data from soils with diverse textural classes and management practices, and similar predictive capacity, although with higher performance index values, for soils of the same textural class exposed to the same management. / O conhecimento das relações entre propriedades físicas e mecânicas do solo pode contribuir no desenvolvimento de funções de pedotransferência (FPTs), que permitam estimar outras propriedades do solo de difícil mensuração. Os objetivos deste trabalho foram estimar a pressão de preconsolidação e a resistência do solo à penetração, com o uso de metodologias de predição, utilizando-se de dados disponíveis na literatura, com valores de características físico-hídricas e mineralógicas dos solos. Os valores estimados foram obtidos a partir de três métodos de modelagem: (i) regressão linear múltipla (RLM), (ii) redes neurais artificiais (RNA) e (iii) máquinas de vetores de suporte (MVS). A primeira metodologia proposta para o desenvolvimento dos modelos preditivos foi a opção stepwise do software IBM-SPSS 20.0®. Os modelos gerados a partir da segunda metodologia, ou seja, das RNA, foram implementados através do perceptron multicamadas com algoritmo backpropagation e otimização Levenberg-Marquardt do software Matlab®2008b, efetuando-se variações do número de neurônios na camada de entrada e número de neurônios na camada intermediária. A terceira metodologia foi gerar FPTs a partir de MVS que se enquadra dentro dos processos de mineração de dados utilizando para tal o software Waikato Environment for Knowledge Analysis® (RapidMiner 5). O treinamento das MVS foi realizado variando-se o número de dados de entrada, a função kernel e coeficientes destas funções. 
Realizadas as estimativas, foram calculados os índices de desempenho (id) e classificados segundo Camargo e Sentelhas (1997), podendo-se assim comparar os métodos entre si e a outros já consagrados. Os resultados obtidos mostraram que modelos de inteligência artificial (RNA e MVS) são eficientes e possuem capacidade preditiva superior aos modelos consagrados, em condições de dados de solos com classes texturais e manejos diversos, e semelhantes ainda que com valores de índice de desempenho superiores para condições de solos de mesma classe textural expostos ao mesmo manejo. Pedofunções Compactação do solo Redes neurais artificiais Máquinas de vetores de suporte Inteligência artificial Pedofunctions Soil compaction Artificial neural networks Support vector machine Artificial intelligence
275	Uma abordagem de monitoramento dos sinais motores da doença de Parkinson baseada em jogos eletrônicos. MEDEIROS, Leonardo Melo de. 15 May 2018 Previous issue date: 2016-06-03 / Capes / Os Sistemas de Monitoramento da Saúde (SMS) possibilitam aos médicos obterem informações sobre o estado de saúde de seus pacientes. Além disso, as identificações dos sintomas das doenças podem auxiliar no diagnóstico precoce e prevenir a ocorrência de situações críticas. Para acompanhar e avaliar a saúde motora de um paciente, é necessário realizar uma avaliação motora por meio de movimentos específicos. Isto dificulta a concepção de um SMS de dados motores não-invasivo e engajados na rotina diária dos pacientes. A abordagem apresentada nesta tese, utiliza os jogos eletrônicos como fator motivacional para o fornecimento dos dados motores. Durante o jogo, o usuário é induzido a executar movimentos relevantes, de modo que um sensor de movimento possa adquiri-los e quantificá-los. Este ambiente lúdico, de jogo eletrônico, abstrai o usuário do contexto de tratamento da saúde e incentiva a execução dos movimentos de uma maneira mais natural do que a imposta por um exame clínico. Para avaliar esta abordagem, foi desenvolvido um jogo com a arquitetura proposta para identificar sintomas motores relacionados com a Doença de Parkinson. Num estudo de caso controle, foram avaliados os movimentos angulares dos braços para quantificar as habilidades motoras desses grupos. 
Os dados coletados foram processados e aplicados a uma Máquina de Vetor de Suporte (SVM) para classificar a ocorrência do sintoma da bradicinesia do Parkinson. Obteve-se uma classificação com uma acurácia de 86,67% e falsos positivos de 6,67%. Além disso, em um experimento para avaliação da aceitação dos usuários, 90% ficaram motivados com o jogo desenvolvido e afirmaram que integrariam o SMS em sua rotina diária. Estes resultados demonstram que a abordagem de monitoramento baseado em jogos, apresentada nesta tese, tem potencial para ser um SMS para monitoramentos dos sintomas motores. / Health Monitoring Systems (HMS) allow doctors to gain a better picture of their patients’ health status. Early identification of symptoms can support diagnosis of the disease and prevent critical situations. In order to monitor a patient’s motor abilities, it is necessary to record and evaluate specific movements. This makes it difficult to design an HMS that integrates non-obtrusively into the patient’s daily routine. The approach presented in this thesis makes use of the motivational power of electronic games. While playing the game, the user is prompted to make the relevant movements, so that an optical sensor can detect and measure them. The playful situation distracts the user from thinking about health issues and therefore encourages more natural movements with improved validity for a health examination. To evaluate this approach, a game was developed and employed to detect Parkinson-related motor symptoms. In a study with patients diagnosed with Parkinson’s disease and a healthy control group, the angular movements of the arms were used to measure motor abilities. The data was then processed and applied to a Support Vector Machine (SVM) to predict, based on the detected movements, whether a subject shows Parkinson-related symptoms or should be classified as healthy. The system classified the subjects with an accuracy of 86.67% and a false positive rate of 6.67%. 
Furthermore, the user acceptance of the game-based approach was studied and showed that 90% of the users felt motivated to play the game as part of their daily routine. These results demonstrate that the game-based approach presented in this thesis has the potential to become a base for HMS that monitor motor symptoms. Ciência da Computação Sistemas de Monitoramento da Saúde Jogos para Saúde Doença de Parkinson Máquina de Vetor de Suporte Health Monitoring Systems Health Games Parkinson Disease Support Vector Machine
276	Classificador máquina de suporte vetorial com análise de Fourier aplicada em dados de EEG e EMG Carvalho, Jhonnata Bezerra de 03 February 2016 Previous issue date: 2016-02-03 / Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) / O classificador Máquina de Suporte Vetorial, que vem do termo em inglês Support Vector Machine, é utilizado em diversos problemas em várias áreas do conhecimento. Basicamente o método utilizado nesse classificador é encontrar o hiperplano que maximiza a distância entre os grupos, para aumentar o poder de generalização do classificador. Neste trabalho, são tratados alguns problemas de classificação binária com dados obtidos através da eletroencefalografia (EEG) e eletromiografia (EMG), utilizando a Máquina de Suporte Vetorial com algumas técnicas complementares, destacadas a seguir como: Análise de Componentes Principais para a identificação de regiões ativas do cérebro, o método do periodograma que é obtido através da Análise de Fourier, para ajudar a discriminar os grupos e a suavização por Médias Móveis Simples para a redução dos ruídos existentes nos dados. Foram desenvolvidas duas funções no software R, para a realização das tarefas de treinamento e classificação. 
Além disso, foram propostos 2 sistemas de pesos e uma medida sumarizadora para auxiliar na decisão do grupo pertencente. A aplicação dessas técnicas, pesos e a medida sumarizadora no classificador, mostraram resultados bastantes satisfatórios, em que os melhores resultados encontrados foram, uma taxa média de acerto de 95,31% para dados de estímulos visuais, 100% de classificação correta para dados de epilepsia e taxas de acerto de 91,22% e 96,89% para dados de movimentos de objetos para dois indivíduos. / The support vector machine classifier is used in several problems in various areas of knowledge. Basically, the method used in this classifier is to find the hyperplane that maximizes the distance between the groups, to increase the generalization power of the classifier. In this work, we treat some binary classification problems with data obtained by electroencephalography (EEG) and electromyography (EMG), using the Support Vector Machine with some complementary techniques: Principal Component Analysis to identify the active regions of the brain, the periodogram method, which is obtained by Fourier analysis, to help discriminate between groups, and Simple Moving Average smoothing to reduce the noise present in the data. Two functions were developed in the software R for the training and classification tasks. In addition, two weighting systems and a summary measure were proposed to help decide group membership. The application of these techniques, weights and the summary measure in the classifier showed quite satisfactory results, where the best results were an average accuracy rate of 95.31% for visual stimuli data, 100% correct classification for epilepsy data and accuracy rates of 91.22% and 96.89% for object motion data for two subjects. Classificador binário Eletroencefalografia Eletromiografia Periodograma Análise de componentes principais Suavização Support vector machine SVM
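Two of the preprocessing steps named in entry 276, the periodogram obtained by Fourier analysis and simple moving average smoothing, can be sketched as follows. The thesis implemented its functions in R and the abstract does not give their details, so this Python version, its function names, the window length and the test signal are illustrative assumptions only.

```python
import cmath
import math


def periodogram(signal):
    """Return |DFT|^2 / n for the frequency indices k = 0 .. n // 2."""
    n = len(signal)
    power = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        power.append(abs(coeff) ** 2 / n)
    return power


def simple_moving_average(values, window):
    """Average each run of `window` consecutive values (output is window - 1 shorter)."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


# Hypothetical test signal: a sinusoid completing 5 cycles over 64 samples.
n = 64
signal = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
power = periodogram(signal)
peak = max(range(len(power)), key=power.__getitem__)  # index of the dominant frequency
smoothed = simple_moving_average(signal, window=4)
print(peak, len(smoothed))
```

The direct DFT above costs O(n^2) and is used only to keep the sketch dependency-free; a fast Fourier transform would normally be preferred for real EEG or EMG recordings.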
277	Classificação de imagens de ambientes coralinos: uma abordagem empregando uma combinação de classificadores e máquina de vetor de suporte Henriques, Antônio de Pádua de Miranda 08 August 2008 Previous issue date: 2008-08-08 / Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / The use of maps obtained from remote sensing orbital images submitted to digital processing has become fundamental to optimizing conservation and monitoring actions for coral reefs. However, the accuracy reached in the mapping of submerged areas is limited by variation of the water column, which degrades the signal received by the orbital sensor and introduces errors in the final result of the classification. The limited capacity of traditional methods based on conventional statistical techniques to solve inter-class problems prompted the search for alternative strategies in the area of Computational Intelligence. In this work, an ensemble of classifiers was built based on the combination of Support Vector Machines and a Minimum Distance Classifier with the objective of classifying remotely sensed images of a coral reef ecosystem. The system is composed of three stages, through which the progressive refinement of the classification process happens. The patterns that received an ambiguous classification in a certain stage of the process were re-evaluated in the subsequent stage. An unambiguous prediction for all the data was achieved through the reduction or elimination of false positives. The images were classified into five bottom types: deep water; underwater corals; intertidal corals; algae; and sandy bottom. The highest overall accuracy (89%) was obtained from the SVM with a polynomial kernel. 
The accuracy of the classified image was compared, using an error matrix, to the results obtained by applying other classification methods based on a single classifier (neural network and the k-means algorithm). Finally, the comparison of the results achieved demonstrated the potential of the classifier ensemble as a tool for classifying images of submerged areas subject to the noise caused by atmospheric effects and the water column. / A utilização de mapas, derivados da classificação de imagens de sensores remotos orbitais, tornou-se de fundamental importância para viabilizar ações de conservação e monitoramento de recifes de corais. Entretanto, a acurácia atingida no mapeamento dessas áreas é limitada pelo efeito da variação da coluna d'água, que degrada o sinal recebido pelo sensor orbital e introduz erros no resultado final do processo de classificação. A limitada capacidade dos métodos tradicionais, baseados em técnicas estatísticas convencionais, para resolver este tipo de problema determinou a investigação de uma estratégia ligada à área da Inteligência Computacional. Neste trabalho foi construído um conjunto de classificadores baseados em Máquinas de Vetor de Suporte e classificador de Distância Mínima, com o objetivo de classificar imagens de sensoriamento remoto de ecossistema de recifes de corais. O sistema é composto por três estágios, através dos quais acontece o refinamento progressivo do processo de classificação. Os padrões que receberam uma classificação ambígua em uma determinada etapa do processo são reavaliados na etapa posterior. A predição não ambígua para todos os dados aconteceu através da redução ou eliminação dos falsos positivos. As imagens foram classificadas em cinco tipos de fundos: águas profundas, corais submersos, corais intermarés, algas e fundo arenoso. A melhor acurácia geral (89%) foi obtida quando foram utilizadas Máquinas de Vetor de Suporte com kernel polinomial. 
A acurácia das imagens classificadas foi comparada, através da utilização de matriz de erro, aos resultados alcançados pela aplicação de outros métodos de classificação baseados em um único classificador (redes neurais e o algoritmo k-means). Ao final, a comparação dos resultados alcançados demonstrou o potencial do conjunto de classificadores como instrumento de classificação de imagens de áreas submersas, sujeitas aos ruídos provocados pelos efeitos atmosféricos e da coluna d'água. Recifes de corais Classificação de imagens Conjunto de classificadores Máquina de vetor de suporte Coral reefs Image classification Classifiers ensemble Support vector machine CNPQ::ENGENHARIAS::ENGENHARIA ELETRICA
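The staged refinement described in entry 277, where patterns that receive an ambiguous label in one stage are re-evaluated in the next, can be illustrated with a small cascade. The abstract does not state how ambiguity is defined or how the SVMs and the minimum distance classifier are arranged, so everything below is hypothetical: both stages are minimum distance (nearest centroid) classifiers, the distance-ratio ambiguity rule is invented for this sketch, and the centroids are made-up two-band values.

```python
import math


def nearest_centroid_stage(centroids, min_ratio):
    """Build a stage that labels a pattern by its nearest class centroid.

    The stage abstains (returns None) when the second-nearest centroid is less
    than `min_ratio` times as far away as the nearest one. This ambiguity rule
    is an assumption made for illustration, not the rule used in the thesis.
    """
    def classify(x):
        ranked = sorted((math.dist(x, c), label) for label, c in centroids.items())
        (d1, label), (d2, _) = ranked[0], ranked[1]
        return label if d2 >= min_ratio * d1 else None
    return classify


def cascade(stages, x):
    """Return the first unambiguous label; later stages only see deferred patterns."""
    for stage in stages:
        label = stage(x)
        if label is not None:
            return label
    return None


# Hypothetical class centroids in a two-band feature space.
centroids = {"sandy bottom": (0.8, 0.7), "deep water": (0.1, 0.2)}
strict = nearest_centroid_stage(centroids, min_ratio=3.0)   # abstains on close calls
lenient = nearest_centroid_stage(centroids, min_ratio=1.0)  # always decides

print(cascade([strict, lenient], (0.75, 0.70)))  # decided by the strict stage
print(cascade([strict, lenient], (0.50, 0.50)))  # deferred to the lenient stage
```

In a design closer to the abstract, the stages would be different classifiers (for example an SVM followed by a minimum distance classifier), each with its own criterion for deferring a pattern.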
278	A Semantic Triplet Based Story Classifier January 2013 Text classification, in the artificial intelligence domain, is an activity in which text documents are automatically classified into predefined categories using machine learning techniques. An example of this is classifying uncategorized news articles into different predefined categories such as "Business", "Politics", "Education", "Technology", etc. In this thesis, a supervised machine learning approach is followed, in which a model is first trained with pre-classified training data and then the class of the test data is predicted. Good feature extraction is an important step in the machine learning approach, and hence the main component of this text classifier is semantic triplet based features, in addition to traditional features like standard keyword based features and statistical features based on shallow parsing (such as density of POS tags and named entities). A triplet {Subject, Verb, Object} in a sentence is defined as a relation between subject and object, the relation being the predicate (verb). The triplet extraction process is a five-step process that takes as its input corpus web text documents, each consisting of one or more paragraphs, ranging from RSS feeds to lists of extremist websites. The input corpus feeds into the "Pronoun Resolution" step, which uses a heuristic approach to identify the noun phrases referenced by the pronouns. The next step, "SRL Parser", is a shallow semantic parser that converts the incoming pronoun-resolved paragraphs into annotated predicate argument format. The output of the SRL parser is processed by the "Triplet Extractor" algorithm, which forms triplets of the form {Subject, Verb, Object}. Generalization and reduction of triplet features is the next step. A reduced feature representation reduces computing time, yields better discriminatory behavior and mitigates the curse of dimensionality. For training and testing, a ten-fold cross validation approach is followed. 
In each round, the SVM classifier is trained with 90% of the labeled (training) data and, in the testing phase, the classes of the remaining 10% (testing) data are predicted. In conclusion, this paper proposes a model with semantic triplet based features for story classification. The effectiveness of the model is demonstrated against other traditional features used in the literature for text classification tasks. / Dissertation/Thesis / M.S. Computer Science 2013 Computer science Machine learning Natural Language Processing Ravi Karad SVM (Support Vector Machine) classifier Text classification
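The ten-fold protocol described in entry 278 (train on 90% of the labeled data, predict the remaining 10%, repeat for each fold) amounts to the following index split. This is a generic sketch; the function name, the shuffling seed and the sample count of 100 are assumptions, and the thesis's own tooling is not described in the abstract.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs; every sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of (nearly) equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# Hypothetical corpus of 100 labeled documents.
splits = list(k_fold_indices(100, k=10))
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

A classifier would be fitted on `train_idx` and scored on `test_idx` in each round, with the ten scores averaged at the end.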
279	Segmentação, classificação e quantificação de bacilos de tuberculose em imagens de baciloscopia de campo claro através do emprego de uma nova técnica de classificação de pixels utilizando máquinas de vetores de suporte Xavier, Clahildek Matos 02 July 2012 Previous issue date: 2012-07-02 / Tuberculosis (TB) is a contagious disease caused by Mycobacterium tuberculosis that primarily affects the lungs and affects over 8.8 million people worldwide. Although the number of TB cases and deaths has fallen over the past years, this disease remains a serious health problem in developing countries. Currently, bright-field and fluorescence smear microscopy are used as initial tests for the diagnosis of TB. The first is used mostly in developing countries because of its low cost; the second is the preferred method in developed countries because it is more sensitive. Among the many challenges for the control of this disease is the development of a rapid, efficient and low-cost method for the diagnosis of tuberculosis. 
The process of diagnosis of smear-field course is time consuming, manual and error-prone, so that there is a high rate of false negatives. Various techniques for pattern recognition in the image smear bright field microscopy have been designed to recognize and count of the rods. This paper describes a new method for segmentation of tubercle bacilli in sputum bright field. The method proposed in this dissertation uses a classifier consisting of a support vector machine. The differential method is proposed in the variables selected for the input of the classifier. They were selected from four color spaces: RGB, HSI, YCbCr and Lab used to both individual characteristics such as subtractions of characteristics of the same color space and different color spaces. We investigated a total of 30 features. The best features were selected using the selection technique scalar features. With the proposed method was reached a sensitivity of 94%. However, further steps for noise reduction are required to minimize the classification errors. / A tuberculose (TB) é uma doença contagiosa causada pelo Mycobacterium tuberculosis que afeta, principalmente, os pulmões e atinge mais de 8,8 milhões de pessoas em todo o mundo. Embora o número de casos de doenças e mortes por TB tenham caído ao longo dos últimos anos, essa doença ainda continua sendo um grave problema de saúde nos países em desenvolvimento. Atualmente, como exames iniciais para o diagnóstico da TB são usados os métodos de baciloscopia de campo claro e baciloscopia de fluorescência. O primeiro é mais usado em países em desenvolvimento, devido ao baixo custo; o segundo é o método preferencial em países desenvolvidos por ser mais sensível. Entre os vários desafios para o controle dessa doença, está o desenvolvimento de um método rápido, eficiente e de baixo custo para o diagnóstico da TB. 
O processo de diagnóstico de baciloscopia de campo claro é demorado, manual e propenso a erros, fazendo com que haja uma alta taxa de falsos negativos. Várias técnicas de reconhecimento de padrão em imagens baciloscópicas de microscopia de campo claro têm sido desenvolvidas para o reconhecimento e contagem dos bacilos. Este trabalho descreve um novo método para segmentação de bacilos da tuberculose em baciloscopia de campo claro. O método proposto utiliza um classificador constituído por uma máquina de vetores de suporte. O diferencial do mesmo em relação a outros trabalhos está nas variáveis selecionadas para a entrada do classificador. Essas variáveis foram selecionadas a partir de quatro espaços de cor: RGB, HSI, YCbCr e Lab. Investigou-se tanto características individuais, como subtrações de características de um mesmo espaço de cor e de espaços de cores diferentes, num total de 30 características. As melhores características foram selecionadas utilizando-se a técnica de seleção escalar de características. Alcançou-se uma sensibilidade de 94%. No entanto, novas etapas para a redução de ruído são necessárias para minimizar os erros de classificação. Segmentação de imagem Microscopia convencional Tuberculose - Métodos de baciloscopia Seleção escalar de características Classificador de pixels Máquina de vetores de suporte Image segmentation Tuberculosis Pixel classifier Support vector machine ENGENHARIAS: ENGENHARIA ELÉTRICA
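The pixel-feature idea in the abstract above (color components, subtractions of components, then scalar feature selection) can be sketched as follows. This is an illustration under stated assumptions, not the dissertation's method: only RGB and YCbCr are computed, the nine features are arbitrary examples rather than the 30 actually studied, the ranking criterion is assumed to be Fisher's discriminant ratio, the YCbCr conversion is the full-range BT.601 form used by JPEG, and the pixel values are made up.

```python
from statistics import mean, pvariance
from typing import Dict, List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # 8-bit (R, G, B)


def rgb_to_ycbcr(r: float, g: float, b: float) -> Tuple[float, float, float]:
    """Full-range BT.601 RGB -> YCbCr conversion (the form used by JPEG)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def pixel_features(pixel: Pixel) -> Dict[str, float]:
    """Individual components plus a few example subtractions (hypothetical set)."""
    r, g, b = pixel
    y, cb, cr = rgb_to_ycbcr(r, g, b)
    return {
        "R": r, "G": g, "B": b, "Y": y, "Cb": cb, "Cr": cr,
        "R-G": r - g,      # subtraction within one color space
        "Cr-Cb": cr - cb,  # subtraction within one color space
        "R-Y": r - y,      # subtraction across two color spaces
    }


def fisher_ratio(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared difference of class means divided by the sum of class variances."""
    gap = (mean(a) - mean(b)) ** 2
    spread = pvariance(a) + pvariance(b)
    if spread == 0:
        return float("inf") if gap > 0 else 0.0
    return gap / spread


def rank_features(bacillus: Sequence[Pixel],
                  background: Sequence[Pixel]) -> List[Tuple[str, float]]:
    """Score each feature on its own and sort from most to least separable."""
    pos = [pixel_features(p) for p in bacillus]
    neg = [pixel_features(p) for p in background]
    scores = [(name, fisher_ratio([f[name] for f in pos], [f[name] for f in neg]))
              for name in pos[0]]
    return sorted(scores, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Made-up labeled pixels; real ones would come from annotated smear images.
    bacillus_px = [(200, 40, 90), (210, 50, 100), (190, 45, 95)]
    background_px = [(90, 120, 200), (100, 130, 210), (95, 125, 190)]
    for name, score in rank_features(bacillus_px, background_px):
        print(name, round(score, 2))
```

The top-ranked features would then form the input vector of the pixel classifier. Scoring each feature independently is cheap, but it ignores correlations between features, so highly ranked features may be redundant with one another.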
280	Classificação de sinais de eletroencefalograma usando máquinas de vetores suporte [Classification of electroencephalogram signals using support vector machines] Chagas, Sandro Luiz das 27 August 2009 (has links) The electroencephalogram (EEG) is a clinical examination widely used to study brain function and neurological disorders. The EEG is a time series that records the electrical activity of the brain. EEG monitoring systems generate a large volume of data, which makes complete visual analysis of these data impractical. This creates strong demand for computational methods that can automatically extract useful information to support diagnosis. Meeting this demand requires a way to extract the diagnostically relevant features from an EEG signal and a way to classify the EEG based on those features. Statistics computed over wavelet coefficients have been used successfully to extract features from many kinds of time series, including EEG signals. Support vector machines (SVMs) are a machine learning technique with high generalization ability and have been used successfully in classification problems by several researchers. This dissertation analyzes the impact of using feature vectors based on wavelet coefficients on the classification of EEG signals with different SVM implementations. support vector machine (SVM) electroencephalogram (EEG) time series feature extraction wavelet CNPQ::ENGENHARIAS::ENGENHARIA ELETRICA
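Feature extraction from statistics over wavelet coefficients, as mentioned in the abstract above, can be sketched as follows. The abstract names neither the wavelet family nor the statistics, so the orthonormal Haar wavelet, the three decomposition levels, and the two statistics per sub-band (mean absolute value and standard deviation) are all assumptions, and the function names are hypothetical.

```python
import math
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def haar_step(signal: Sequence[float]) -> Tuple[List[float], List[float]]:
    """One level of the orthonormal Haar transform.

    Returns (approximation, detail). A trailing unpaired sample is dropped.
    """
    s = math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) / s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / s for i in pairs]
    return approx, detail


def wavelet_bands(signal: Sequence[float], levels: int) -> List[List[float]]:
    """Detail coefficients for each level, followed by the final approximation."""
    bands: List[List[float]] = []
    approx = list(signal)
    for _ in range(levels):
        if len(approx) < 2:
            break
        approx, detail = haar_step(approx)
        bands.append(detail)
    bands.append(approx)
    return bands


def wavelet_features(signal: Sequence[float], levels: int = 3) -> List[float]:
    """Two statistics per sub-band: mean absolute coefficient and std deviation."""
    features: List[float] = []
    for band in wavelet_bands(signal, levels):
        features.append(mean([abs(c) for c in band]))
        features.append(pstdev(band))
    return features


if __name__ == "__main__":
    # Made-up 64-sample segment: a slow oscillation plus a faster, weaker one.
    segment = [math.sin(2 * math.pi * t / 32) + 0.3 * math.sin(2 * math.pi * t / 4)
               for t in range(64)]
    print(wavelet_features(segment, levels=3))
```

Each signal segment yields a fixed-length vector (two values per sub-band), which is the kind of input a classifier such as an SVM expects regardless of the segment's original length.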
