421 |
A Semantic Triplet Based Story Classifier, January 2013
abstract: Text classification, in the artificial intelligence domain, is the task of automatically assigning text documents to predefined categories using machine learning techniques. An example is classifying uncategorized news articles into predefined categories such as "Business", "Politics", "Education", "Technology", etc. In this thesis, a supervised machine learning approach is followed, in which a model is first trained with pre-classified training data and the class of test data is then predicted. Good feature extraction is an important step in the machine learning approach, and hence the main component of this text classifier is semantic triplet based features, in addition to traditional features such as standard keyword based features and statistical features based on shallow parsing (such as the density of POS tags and named entities). A triplet {Subject, Verb, Object} in a sentence is defined as a relation between the subject and the object, the relation being the predicate (verb). Triplet extraction is a five-step process whose input corpus consists of web text documents, each containing one or many paragraphs, ranging from RSS feeds to extremist websites. The input corpus feeds into the "Pronoun Resolution" step, which uses a heuristic approach to identify the noun phrases referenced by pronouns. The next step, the "SRL Parser", is a shallow semantic parser that converts the incoming pronoun-resolved paragraphs into an annotated predicate-argument format. The output of the SRL parser is processed by the "Triplet Extractor" algorithm, which forms triplets in the form {Subject, Verb, Object}. Generalization and reduction of the triplet features is the next step: a reduced feature representation lowers computing time, yields better discriminatory behavior and mitigates the curse of dimensionality. For training and testing, a ten-fold cross-validation approach is followed; in each round an SVM classifier is trained with 90% of the labeled (training) data, and in the testing phase the classes of the remaining 10% of unlabeled (testing) data are predicted. In conclusion, this thesis proposes a model with semantic triplet based features for story classification, and its effectiveness is demonstrated against other traditional features used in the literature for text classification tasks. / Dissertation/Thesis / M.S. Computer Science 2013
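A minimal sketch of the classification stage described above, assuming the {Subject, Verb, Object} triplets have already been extracted by the pronoun-resolution/SRL pipeline. The documents, labels and triplet tokens below are invented placeholders, and scikit-learn's LinearSVC stands in for the thesis's SVM classifier.

```python
# Ten-fold cross-validation of an SVM over bag-of-triplet features.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Each document is represented by its extracted triplets, flattened into
# "subject_verb_object" tokens. These are placeholder examples only.
docs_as_triplets = [
    "group_claimed_attack police_arrested_suspect",
    "company_reported_profit market_rewarded_stock",
] * 10                                      # stand-in corpus; one entry per document
labels = ["extremist", "business"] * 10     # one class label per document

pipeline = make_pipeline(
    CountVectorizer(token_pattern=r"\S+"),  # one feature per triplet token
    LinearSVC(),                            # linear SVM classifier
)

# Ten folds: train on 90% of the labeled data, predict the held-out 10%.
scores = cross_val_score(pipeline, docs_as_triplets, labels, cv=10)
print("mean accuracy:", scores.mean())
```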
|
422 |
DIAGNÓSTICO DE DIABETES TIPO II POR CODIFICAÇÃO EFICIENTE E MÁQUINAS DE VETOR DE SUPORTE / DIAGNOSIS OF TYPE II DIABETES BY EFFICIENT CODING AND SUPPORT VECTOR MACHINES. Ribeiro, Aurea Celeste da Costa, 30 June 2009
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior / Diabetes is a disease caused by the pancreas failing to produce insulin. It is incurable and its treatment is based on diet, exercise and drugs. The costs of diagnosing the disease in the population, and the human resources required for it, have become high and inefficient. Computer-aided diagnosis (CAD) systems are one way to address this problem. Our study proposes a CAD system based on the one-class support vector machine (SVM) method and on efficient coding with independent component analysis (ICA) to classify a patient data set into diabetic and non-diabetic cases. First, classification tests were done using both the non-invasive and the invasive characteristics of the disease. Then we ran a test without the invasive characteristics, plasma glucose concentration and 2-hour serum insulin (mu U/ml), which require blood samples. We obtained accuracies of 99.84% and 99.28%, respectively. Further tests were made without the invasive characteristics, also excluding one non-invasive characteristic at a time, to observe the influence of each
one in the final results. / Diabetes is a disease caused by the failure of the pancreas to produce insulin; it is incurable and its treatment is based on diet, exercise and medication. The costs of treating the disease, diagnosing it in the population and fighting it are rising steadily. Systems that support the diagnosis of the disease are one way to help reduce these costs. Our method proposes a diagnosis-support system based on one-class support vector machines and on efficient coding through independent component analysis to classify a database of patients as diabetic or non-diabetic. First, classification tests were performed with the non-invasive and invasive characteristics of the database combined. Next, we ran a test without the invasive characteristics of the database, namely fasting glucose and insulin, which require blood collection. Accuracy rates of 99.84% and 99.28% were obtained, respectively. Further tests were performed without the invasive characteristics, removing one non-invasive characteristic at a time, in order to observe the influence of each one on the final result.
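A rough sketch of the proposed pipeline, with synthetic placeholder records standing in for the patient database; the feature count, ICA dimensionality and one-class SVM parameters are assumptions, not the thesis's actual settings.

```python
# Efficient coding with ICA followed by a one-class SVM trained on one class only.
import numpy as np
from sklearn.decomposition import FastICA
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X_diabetic = rng.normal(loc=1.0, size=(200, 6))   # placeholder: diabetic records
X_unseen = rng.normal(loc=0.0, size=(50, 6))      # placeholder: records to screen

# "Efficient coding": represent each record by its independent components.
scaler = StandardScaler().fit(X_diabetic)
ica = FastICA(n_components=4, random_state=0).fit(scaler.transform(X_diabetic))
codes_train = ica.transform(scaler.transform(X_diabetic))
codes_test = ica.transform(scaler.transform(X_unseen))

# One-class SVM trained only on the diabetic class; +1 = looks diabetic, -1 = not.
detector = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit(codes_train)
predictions = detector.predict(codes_test)
print("flagged as diabetic:", int((predictions == 1).sum()), "of", len(predictions))
```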
|
423 |
Využití metod UI v algoritmickém obchodování / AI techniques in algorithmic trading. Šmejkal, Oldřich, January 2015
This diploma thesis surveys and describes the current state of the machine learning field, concentrating on methods that can be used for prediction and classification of time series and that can then be applied in algorithmic trading. The theoretical section is meant to explain the basic principles of financial markets, algorithmic trading and machine learning even to a reader who was previously only superficially familiar with the subject. The main objective of the application part is to choose appropriate methods and procedures that reflect the current state of the art in machine learning and then to apply them to historical price data. The result of applying the selected methods is a measure of their success on out-of-sample data that was not used during model calibration. Prediction success was evaluated by an accuracy metric together with the Sharpe ratio of a basic trading strategy based on the model's predictions. A secondary outcome of this work is to explore the possibilities and test the usability of the technologies used in the application part: specifically, the SciPy environment, which combines Python with packages and tools designed for data analysis, statistics and machine learning, is tested and used.
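A minimal sketch of the evaluation described above: directional accuracy plus the annualized Sharpe ratio of a strategy that goes long or short according to the prediction. The returns and predictions are randomly generated placeholders, and the 252-day annualization is an assumption.

```python
import numpy as np

rng = np.random.default_rng(1)
daily_returns = rng.normal(0.0002, 0.01, size=500)          # placeholder price returns
predicted_direction = np.sign(daily_returns + rng.normal(0, 0.01, size=500))
predicted_direction[predicted_direction == 0] = 1            # break ties as "long"

# Accuracy of the predicted direction against the realized direction.
accuracy = np.mean(predicted_direction == np.sign(daily_returns))

# Basic strategy: hold long when +1, short when -1, then annualize its Sharpe ratio.
strategy_returns = predicted_direction * daily_returns
sharpe = np.sqrt(252) * strategy_returns.mean() / strategy_returns.std()

print(f"directional accuracy: {accuracy:.2%}, annualized Sharpe: {sharpe:.2f}")
```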
|
424 |
Mesure de l'attention visuo-spatiale dans l'espace et le temps par les potentiels reliés aux événements (PRÉ). Pelland-Goulet, Pénélope, 06 1900
Event-related potentials (ERPs) are very commonly used as a method of measuring visual attention. Certain ERP components, such as the N2pc and the P3, are widely regarded as markers of attention deployment. To investigate whether it is possible to determine the location to which attention is directed, or the presence or absence of attention at a given location, a spatial cueing task was used. The cue indicated one of four locations to which participants had to direct their attention. The spatial cue was exclusively symbolic, meaning that attention had to be shifted voluntarily. The EEG signals recorded while participants performed the task were analyzed using a machine learning technique. An SVM (Support Vector Machine) classifier was used to predict the presence or absence of attention at a location from the EEG signal associated with targets and distractors. A decoding accuracy of 75% (p < 0.001) was obtained for this classification, with a chance level of 50%. A DSVM (dendrogram SVM) classifier was used to predict the precise locus of attention using the EEG signal linked to targets only. In this classification problem, a correct-prediction rate of 51.7% (p < 0.001) was obtained, with a chance level of 25%. The results indicate that it is possible to distinguish the attentional locus from ERPs within a space of +/- 0.4 degrees of visual angle, with accuracy rates well above chance level. / Event related potentials (ERP) are commonly used as a method of measuring visual attention. ERP components such as the N2pc and P3 are widely considered markers of attention deployment. In order to investigate the possibility of predicting the locus and the presence or absence of attention, a spatial cueing task was used. A cue indicated one of four locations on which subjects had to direct their attention. The spatial cue was exclusively symbolic, implying that attention had to be oriented voluntarily. The analysis of the EEG signal measured as subjects carried out the task was performed using machine learning. An SVM (Support Vector Machine) classifier was used to predict the presence or absence of attention at one location, using the EEG signal associated with targets and distractors. A decoding accuracy of 75% (p < 0.001) was achieved for this classification, with a chance level of 50%. A DSVM (Dendrogram SVM) was used to predict the precise locus of attention using the EEG signal linked to targets only. In this classification problem, a decoding accuracy of 51.7% (p < 0.001) was achieved, with a chance level of 25%. These results suggest that it is possible to distinguish the locus of attention from ERPs in a +/- 0.4 degree space of visual angle with decoding accuracies considerably above chance.
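A hedged sketch of the target/distractor decoding step: a linear SVM with cross-validation compared against the 50% chance level. The ERP feature matrix below is synthetic; the actual analysis used the recorded EEG signals and the study's own SVM and DSVM implementations.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_trials, n_features = 400, 64            # e.g. electrodes x time windows, flattened
X_attended = rng.normal(0.3, 1.0, size=(n_trials // 2, n_features))    # targets
X_unattended = rng.normal(0.0, 1.0, size=(n_trials // 2, n_features))  # distractors
X = np.vstack([X_attended, X_unattended])
y = np.array([1] * (n_trials // 2) + [0] * (n_trials // 2))  # attention present/absent

decoder = make_pipeline(StandardScaler(), SVC(kernel="linear"))
scores = cross_val_score(decoder, X, y, cv=10)
print(f"decoding accuracy: {scores.mean():.2%} (chance = 50%)")
```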
|
425 |
A comparison of three brain atlases for MCI prediction / 軽度認知障害からアルツハイマー病への移行予測精度における脳アトラス選択の影響. Ota, Kenichi, 23 March 2015
Kyoto University / Doctor of Medical Science / Graduate School of Medicine, Kyoto University / Examiners: Prof. Kenji Kawano, Prof. Toshiaki Furukawa, Prof. Ryosuke Takahashi / DGAM
|
426 |
CONTENT UNDERSTANDING FOR IMAGING SYSTEMS: PAGE CLASSIFICATION, FADING DETECTION, EMOTION RECOGNITION, AND SALIENCY BASED IMAGE QUALITY ASSESSMENT AND CROPPING. Shaoyuan Xu (9116033), 12 October 2021
This thesis consists of four sections, each corresponding to a research project.

The first section is about page classification. In this section, we extend our previous approach, which could classify 3 classes of pages (Text, Picture and Mixed), to 5 classes: Text, Picture, Mixed, Receipt and Highlight. We first design new features to characterize the two new classes and then use a DAG-SVM to classify the 5 classes of images. Based on the results, our algorithm performs well and is able to classify 5 types of pages.

The second section is about fading detection. In this section, we develop an algorithm that can automatically detect fading in both text and non-text regions. For text regions, we first do global alignment and then perform local alignment. After that, we create a 3D color node system, assign each connected component to a color node and compute the color difference between raster-page connected components and scanned-page connected components. For non-text regions, after global alignment, we divide the page into "super pixels" and compute the color difference between raster super pixels and testing super pixels. Compared with the traditional method that uses a diagnostic page, our method is more efficient and effective.

The third section is about CNN-based emotion recognition. In this section, we build our own emotion recognition classification and regression system from scratch. It includes data set collection, data preprocessing, model training and testing. We extend the model to a real-time video application, where it performs accurately and smoothly. We also try another approach to the emotion recognition problem using Facial Action Unit detection. By extracting facial landmark features and adopting an SVM training framework, the Facial Action Unit approach achieves accuracy comparable to the CNN-based approach.

The fourth section is about saliency-based image quality assessment and cropping. In this section, we propose a method for image quality assessment and recomposition with the help of image saliency information. Saliency is the remarkable region of an image that attracts people's attention easily and naturally. By showing everyday examples as well as our experimental results, we demonstrate that utilizing the saliency information is beneficial for both tasks.
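An illustrative sketch of the non-text fading check: compare mean block colors between the aligned raster and scanned pages. Square blocks stand in for the superpixels mentioned above, and the block size, threshold and images are placeholder assumptions.

```python
import numpy as np

def mean_block_colors(image, block=32):
    """Average RGB color of each block x block tile of an H x W x 3 image."""
    h, w, _ = image.shape
    h, w = h - h % block, w - w % block              # crop to a multiple of the block size
    tiles = image[:h, :w].reshape(h // block, block, w // block, block, 3)
    return tiles.mean(axis=(1, 3))                    # shape (H/block, W/block, 3)

def fading_map(raster_page, scanned_page, block=32, threshold=30.0):
    """Boolean map marking blocks whose mean color drifted more than the threshold."""
    diff = mean_block_colors(raster_page, block) - mean_block_colors(scanned_page, block)
    return np.linalg.norm(diff, axis=-1) > threshold

# Placeholder pages: the scanned copy is uniformly lighter, simulating fading.
raster = np.full((256, 256, 3), 120.0)
scanned = raster + 40.0
print("faded blocks:", int(fading_map(raster, scanned).sum()))
```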
|
427 |
Klasifikace signálů denní aktivity nasnímaných zařízením Faros / Classification of free living data sensed with Faros. Šalamoun, Jan, January 2018
The topic of this master's thesis is the classification of free-living data sensed with the Faros device. Faros is a small, compact device that measures ECG and 3-axis accelerometer data. The first part of the thesis investigates how free-living activities can be measured automatically with an accelerometer and ECG. Next, data for 8 activities were recorded from 10 probands. Automatic classification algorithms for these data were implemented in Matlab, applied to the datasets and compared with manually recorded references. Finally, the data were statistically evaluated.
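A hedged sketch of the activity-classification idea (the thesis itself works in Matlab): simple per-window statistics from 3-axis accelerometer data feeding a classifier. The windows, activities and choice of an SVM are illustrative assumptions, not the thesis's actual features or model.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def window_features(acc_window):
    """Mean, standard deviation and magnitude statistics of a (samples x 3) window."""
    magnitude = np.linalg.norm(acc_window, axis=1)
    return np.concatenate([
        acc_window.mean(axis=0),
        acc_window.std(axis=0),
        [magnitude.mean(), magnitude.std()],
    ])

rng = np.random.default_rng(0)
# Placeholder windows for two activities; real data would come from the device.
walking = [rng.normal(0.0, 1.0, size=(100, 3)) for _ in range(100)]
sitting = [rng.normal(0.0, 0.1, size=(100, 3)) for _ in range(100)]
X = np.array([window_features(w) for w in walking + sitting])
y = np.array([0] * 100 + [1] * 100)   # 0 = walking, 1 = sitting

model = make_pipeline(StandardScaler(), SVC(kernel="rbf", gamma="scale"))
print("cross-validated accuracy:", cross_val_score(model, X, y, cv=5).mean())
```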
|
428 |
Řízení a měření sportovních drilů hlasem/zvuky / Controlling and Measuring Sport Drills by Voice/Sound. Odehnal, Jiří, January 2019
This master's thesis deals with the design and development of a mobile application for the Android platform. The aim of the work is to implement a simple and user-friendly interface that supports and assists the user during training and sport exercises. The thesis also includes the implementation of sound detection to support exercises and of voice instructions given by the application. In practice, the application should make training exercises more comfortable without forcing the user to keep the mobile device in hand.
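The sound-detection component can be illustrated with a short-time energy threshold, as in the sketch below; the real application runs on Android, so this Python version, with a made-up signal and threshold, only shows the triggering principle rather than the app's actual implementation.

```python
import numpy as np

def detect_sound_events(samples, rate=44100, frame_ms=20, threshold=0.1):
    """Return start times (seconds) of frames whose RMS energy crosses the threshold."""
    frame = int(rate * frame_ms / 1000)
    n_frames = len(samples) // frame
    frames = samples[:n_frames * frame].reshape(n_frames, frame)
    rms = np.sqrt((frames ** 2).mean(axis=1))
    loud = rms > threshold
    onsets = np.flatnonzero(loud & ~np.roll(loud, 1))   # rising edges only
    return onsets * frame / rate

# Placeholder signal: quiet noise with a short loud burst ("clap") at 1 second.
rate = 44100
signal = np.random.default_rng(0).normal(0, 0.01, rate * 2)
signal[rate:rate + 2000] += 0.5
print("detected events at (s):", detect_sound_events(signal, rate))
```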
|
429 |
Analýza sociálních sítí využitím metod rozpoznání vzoru / Social Network Analysis using methods of pattern recognition. Križan, Viliam, January 2015
This master's thesis deals with recognizing emotions from text in social networks. The thesis describes current feature-extraction methods and the lexicons, corpora and classifiers in use. Emotions were recognized using a classifier trained without manually annotated data, on data from the Twitter microblogging network. An advantage of using Twitter was the geographic delimitation of the data, which makes it possible to track changes in the emotions of the population in different cities. The first classification approach was a baseline algorithm that used a simple lexicon. To improve the classification, a more complex SVM classifier was used in the second step. The SVM classifiers and the feature extraction and selection came from the available Python library Scikit. Data for training the classifier were collected from the USA with the help of a purpose-built application. The classifier was trained on data labeled at collection time, without manual annotation. Two different implementations of SVM classifiers were used. The resulting classified emotions, for different cities and days, were displayed as colored markers on a map.
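A minimal sketch of the kind of scikit-learn pipeline implied above: bag-of-words features feeding a linear SVM. The tweets and emotion labels are invented examples; the thesis trains on tweets labeled automatically at collection time rather than on hand-written samples.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Placeholder training data: short texts with their emotion labels.
tweets = [
    "so happy about this sunny day :)",
    "this traffic makes me furious",
    "what a wonderful concert tonight",
    "I am so angry at this service",
]
emotions = ["joy", "anger", "joy", "anger"]

# TF-IDF features feeding a linear SVM classifier.
classifier = make_pipeline(TfidfVectorizer(), LinearSVC())
classifier.fit(tweets, emotions)
print(classifier.predict(["feeling great and happy today"]))
```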
|
430 |
Klasifikátory proudových otisků / Classifiers of power patterns. Zapletal, Ondřej, January 2014
Over the last several years, side-channel analysis has emerged as a major threat to the security of sensitive information in cryptographic devices. Several side channels have been discovered and used to break implementations of all major cryptographic algorithms (AES, DES, RSA). This thesis is focused on power analysis attacks. A variety of power analysis methods have been developed to perform these attacks, including simple power analysis (SPA), differential power analysis (DPA) and template attacks. This work provides a comprehensive survey of these methods and also investigates the application of machine learning techniques to power analysis. The learning techniques considered are neural networks and support vector machines. The final part of the thesis is dedicated to the implementation of an attack against a protected software AES implementation used in the DPA Contest.
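A sketch of the machine-learning profiling idea surveyed above: an SVM trained to recover a key-dependent intermediate bit from power traces. The traces are synthetic, with leakage injected at a single time sample; a real attack would use measured traces from the target device.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_traces, n_samples = 1000, 200
bit = rng.integers(0, 2, size=n_traces)                 # key-dependent bit per trace
leak = np.zeros((n_traces, n_samples))
leak[:, 100] = 0.5 * bit                                # leakage at one time sample
traces = leak + rng.normal(0, 1.0, size=(n_traces, n_samples))  # noisy power traces

# Profiling phase: learn the trace-to-bit mapping; attack phase: predict unseen traces.
X_train, X_test, y_train, y_test = train_test_split(traces, bit, random_state=0)
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", gamma="scale"))
model.fit(X_train, y_train)
print("bit-recovery accuracy:", model.score(X_test, y_test))
```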
|