  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
61

Classification of uncertain data in the framework of belief functions : nearest-neighbor-based and rule-based approaches / Classification des données incertaines dans le cadre des fonctions de croyance : la métode des k plus proches voisins et la méthode à base de règles

Jiao, Lianmeng 26 October 2015
In many classification problems, data are inherently uncertain. The available training data might be imprecise, incomplete, or even unreliable. In addition, partial expert knowledge characterizing the classification problem may also be available. These different types of uncertainty pose great challenges for classifier design. The theory of belief functions provides a well-founded and elegant framework for representing and combining a large variety of uncertain information. In this thesis, we use this theory to address uncertain data classification problems based on two popular approaches, namely the k-nearest neighbor (kNN) rule and rule-based classification systems. For the kNN rule, one concern is that imprecise training data in regions where classes overlap may greatly affect its performance. An evidential editing version of the kNN rule was developed, based on the theory of belief functions, to model the imprecise information carried by samples in overlapping regions. Another consideration is that sometimes only an incomplete training data set is available, in which case the performance of the kNN rule degrades dramatically. Motivated by this problem, we designed an evidential fusion scheme for combining a group of pairwise kNN classifiers built on locally learned pairwise distance metrics. For rule-based classification systems, to improve their performance in complex applications, we extended the traditional fuzzy rule-based classification system in the framework of belief functions and developed a belief rule-based classification system to handle uncertain information in complex classification problems.
Further, considering that in some applications partial expert knowledge may be available in addition to training data collected by sensors, a hybrid belief rule-based classification system was developed to use these two types of information jointly for classification.
62

Bank Customer Churn Prediction : A comparison between classification and evaluation methods

Tandan, Isabelle; Goteman, Erika January 2020
This study assesses which supervised statistical learning method (random forest, logistic regression or k-nearest neighbor) is best at predicting bank customer churn. Additionally, the study evaluates which cross-validation approach, k-fold cross-validation or leave-one-out cross-validation, yields the most reliable results. Predicting customer churn has grown in popularity because new technology, regulation and changed demand have increased competition among banks. Banks therefore have all the more reason to recognize the importance of maintaining their customer base.   The study finds that an unrestricted random forest model estimated using k-fold cross-validation is preferable in terms of performance measures, computational efficiency and theory. Although k-fold cross-validation and leave-one-out cross-validation yield similar results, k-fold cross-validation is preferable because of its computational advantages.   For future research, methods that generate models with both good interpretability and high predictive power would be beneficial, so that knowing which customers end their engagement can be combined with understanding why. It would also be interesting to analyze at which dataset size leave-one-out cross-validation and k-fold cross-validation yield the same results.
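The computational argument for k-fold over leave-one-out cross-validation can be illustrated with a minimal sketch. It is not taken from the thesis: the function names are hypothetical, and the folds are contiguous and unshuffled for simplicity. It shows that leave-one-out is the special case where the number of folds equals the number of samples, so it needs one model fit per sample rather than k fits.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous, unshuffled folds."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def leave_one_out_splits(n_samples):
    """Leave-one-out is k-fold with k equal to the number of samples."""
    return k_fold_splits(n_samples, n_samples)


# Each split corresponds to one model fit: 5 fits versus 10 fits for 10 samples.
five_fold = list(k_fold_splits(10, 5))
loo = list(leave_one_out_splits(10))
print(len(five_fold), len(loo))
```

With n samples, leave-one-out therefore costs n model fits, which is where the computational advantage of a small k comes from.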
63

Detekce fibrilace síní v krátkodobých EKG záznamech / Detection of atrial fibrillation in short-term ECG

Ambrožová, Monika January 2019
Atrial fibrillation is diagnosed in 1-2% of the population. In the coming decades, the number of patients with this arrhythmia is expected to increase significantly, owing to the aging of the population and the higher incidence of some diseases considered risk factors for atrial fibrillation. The aim of this work is to describe the problem of atrial fibrillation and the methods that allow its detection in the ECG record. The first part of the work covers the theory of cardiac physiology and atrial fibrillation, and gives a basic description of atrial fibrillation detection. The practical part describes software for atrial fibrillation detection provided by the company BTL. Furthermore, an atrial fibrillation detector is designed. Several parameters were selected to capture the variation of RR intervals: standard deviation, coefficients of skewness and kurtosis, coefficient of variation, root mean square of successive differences, normalized absolute deviation, normalized absolute difference, median absolute deviation and entropy. Three different classification models were used: support vector machine (SVM), k-nearest neighbor (KNN) and discriminant analysis. The SVM classification model achieved the best results, with the following success indicators: sensitivity 67.1%, specificity 97.0%, F-measure 66.8%, accuracy 92.9%.
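A few of the RR-interval variability parameters listed above have standard textbook definitions and can be sketched as follows. This is an illustrative sketch only, not the detector from the thesis: the function name, the choice of the sample standard deviation, and the example interval values are assumptions, and the parameters whose exact normalization the abstract does not specify are left out.

```python
import statistics


def rr_features(rr_ms):
    """Compute a few RR-interval variability features (intervals in milliseconds)."""
    mean_rr = statistics.fmean(rr_ms)
    sd = statistics.stdev(rr_ms)  # sample standard deviation (assumed convention)
    # Root mean square of successive differences between neighbouring intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    rmssd = (sum(d * d for d in diffs) / len(diffs)) ** 0.5
    # Median absolute deviation around the median interval.
    median_rr = statistics.median(rr_ms)
    mad = statistics.median(abs(x - median_rr) for x in rr_ms)
    return {
        "sd": sd,
        "cv": sd / mean_rr,  # coefficient of variation
        "rmssd": rmssd,
        "mad": mad,
    }


# Hypothetical interval series: a steady rhythm and an irregular one.
regular = rr_features([800, 810, 790, 805, 795])
irregular = rr_features([620, 980, 710, 1100, 560])
print(regular["rmssd"] < irregular["rmssd"])
```

Feature vectors of this kind would then be passed to a classifier such as the SVM or KNN models named in the abstract.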
64

Adaptivní klient pro sociální síť Twitter / Adaptive Client for Twitter Social Network

Guňka, Jiří January 2011
The goal of this term project is to create a user-friendly Twitter client. The client may use machine learning methods, such as a naive Bayes classifier, to point out new tweets of interest. Hyperbolic trees and some other methods will be used to visualize these tweets.
65

Αναγνώριση βασικών κινήσεων του χεριού με χρήση ηλεκτρομυογραφήματος / Recognition of basic hand movements using electromyography

Σαψάνης, Χρήστος 13 October 2013
The aim of this work was to recognize six basic movements of the hand using two systems. Because the topic is interdisciplinary, the anatomy of the forearm muscles, biosignals, the method of electromyography (EMG) and pattern recognition methods were studied. The signal contained considerable noise and had to be analyzed using EMD; features were then extracted and their dimensionality reduced using RELIEF and PCA to improve the classification success rate. The first part uses a Delsys EMG system, initially on one individual and then on six people, with the average rate of successful classification for the six movements exceeding 80%. The second part involves the construction of a standalone EMG system using an Arduino microcontroller, EMG sensors and electrodes, which are mounted in an elastic glove. Classification results in this case reached 75%.
66

Detekce fibrilace síní v EKG / ECG based atrial fibrillation detection

Prokopová, Ivona January 2020
Atrial fibrillation is one of the most common cardiac rhythm disorders, characterized by ever-increasing prevalence and incidence in the Czech Republic and abroad. The incidence of atrial fibrillation is reported at 2-4 % of the population, but because the course is often asymptomatic, the real prevalence is even higher. The aim of this work is to design an algorithm for automatic detection of atrial fibrillation in the ECG record, and the practical part proposes such an algorithm. For the detection itself, the k-nearest neighbor method, the support vector machine and a multilayer neural network were used to classify ECG signals using features indicating the variability of RR intervals and the presence of the P wave in the ECG recordings. The best detection was achieved by a multilayer neural network classifier with two hidden layers. Results of success indicators: Sensitivity 91.23 %, Specificity 99.20 %, PPV 91.23 %, F-measure 91.23 % and Accuracy 98.53 %.
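The success indicators quoted above all derive from the four counts of a binary confusion matrix. The sketch below uses the standard definitions; the function name and the example counts are hypothetical and are not the data behind the reported figures. It also shows that when sensitivity and PPV are equal, the F-measure (their harmonic mean) takes the same value, which is consistent with the three identical percentages reported.

```python
def detection_metrics(tp, fp, tn, fn):
    """Standard binary detection metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of true AF episodes that were detected
    specificity = tn / (tn + fp)  # share of non-AF records correctly rejected
    ppv = tp / (tp + fp)          # positive predictive value (precision)
    f_measure = 2 * ppv * sensitivity / (ppv + sensitivity)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "f_measure": f_measure,
        "accuracy": accuracy,
    }


# Hypothetical counts chosen so that false positives equal false negatives.
metrics = detection_metrics(tp=90, fp=10, tn=890, fn=10)
print(metrics)
```

Because the negative class is much larger in this example, accuracy and specificity stay high even though one in ten positive cases is missed, which is why sensitivity and PPV are reported alongside accuracy.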
67

Neue Indexingverfahren für die Ähnlichkeitssuche in metrischen Räumen über großen Datenmengen / New indexing methods for similarity search in metric spaces over large data sets

Guhlemann, Steffen 08 April 2016
A topic of growing importance in computer science is the handling of similarity in many heterogeneous domains. Currently there is no universally usable infrastructure for similarity search in general metric spaces. The goal of this work is to lay the foundation for such an infrastructure, which could be integrated into classical database management systems. An analysis of the state of the art identifies the M-Tree as the most suitable base structure. It is then extended in multiple ways to the EM-Tree while retaining structural compatibility with the M-Tree. The query algorithms are optimized to reduce the number of necessary distance calculations. Building on a mathematical analysis of the relation between tree structure and query cost, degrees of freedom in the tree edit algorithms are used to construct trees that answer similarity queries with a minimal number of distance calculations.
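The emphasis on saving distance calculations rests on a general property of metric spaces: the triangle inequality lets an index discard candidates using distances it has already stored. The sketch below illustrates that principle with a single pivot and a range query. It is not the M-Tree or EM-Tree algorithm from the thesis; the function name, the pivot layout and the example points are assumptions made for illustration.

```python
import math


def range_search(query, radius, pivot, objects, pivot_dists, metric):
    """Range query with single-pivot pruning.

    pivot_dists[i] is the precomputed distance metric(pivot, objects[i]).
    Returns (hits, number_of_distance_calls_made_at_query_time).
    """
    calls = 1
    d_qp = metric(query, pivot)
    hits = []
    for obj, d_po in zip(objects, pivot_dists):
        # Triangle inequality: |d(q,p) - d(p,o)| <= d(q,o).
        # If this lower bound already exceeds the radius, o cannot be a hit.
        if abs(d_qp - d_po) > radius:
            continue
        calls += 1
        if metric(query, obj) <= radius:
            hits.append(obj)
    return hits, calls


pivot = (0.0, 0.0)
objects = [(1.0, 0.0), (5.0, 0.0), (10.0, 0.0), (0.0, 1.5)]
pivot_dists = [math.dist(pivot, o) for o in objects]  # computed once, at index time
hits, calls = range_search((1.0, 1.0), 1.2, pivot, objects, pivot_dists, math.dist)
print(hits, calls)
```

Here two of the four objects are excluded without computing their distance to the query. The pruning uses only the metric axioms, so the same idea carries over to metrics that are far more expensive than the Euclidean distance used in this example.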
68

Topics in random matrices and statistical machine learning / ランダム行列と統計的機械学習について

Sushma, Kumari 25 September 2018
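Kyoto University / 0048 / New system, course-based doctorate / Doctor of Science / Kō No. 21327 / Rihaku No. 4423 / 新制||理||1635 (University Library) / Division of Mathematics and Mathematical Sciences, Graduate School of Science, Kyoto University / (Chief examiner) Associate Professor COLLINS, Benoit Vincent Pierre; Professor 泉 正己; Professor 日野 正訓 / Qualifies under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Science / Kyoto University / DFAM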
69

Classification of Radar Emitters Based on Pulse Repetition Interval using Machine Learning

Svensson, André January 2022
In electronic warfare, one of the key technologies is radar. Radar is used to detect and identify unknown aerial, nautical or land-based objects. An attribute of a pulsed radar signal is the Pulse Repetition Interval (PRI), which is the time interval between pulses in a pulse train. In a passive radar receiver system, the PRI can be used to recognize the emitter system. Correct classification of emitter systems is a crucial part of Electronic Support Measures (ESM) and Radar Warning Receivers (RWR), so that appropriate measures can be deployed depending on the emitter system. Inaccurate predictions of emitter systems can have lethal consequences, and variables such as time and confidence in the predictions are essential for an effective predictive method. Due to the classified nature of military systems and techniques, there are no industry-standard systems or techniques that perform quick and accurate classification of emitter systems based on PRI. Therefore, methods that allow fast and accurate predictions based on PRI are highly desirable and worthy of research. This thesis explores and compares the capabilities of two machine learning methods for the task of classifying emitters based on received PRI. The first method is an attention-based model, which performs well at all levels of realistic noise, is quick to learn and is even quicker to give accurate predictions. The second method is a K-Nearest Neighbor (KNN) implementation that performs well on noise-free PRI but whose performance degrades as the amount of noise increases. An additional outcome of this thesis is the development of a system to generate samples in an automated fashion. The attention-based model achieves a macro-average F1-score of 63% in the 59-class recognition task, whereas the KNN performs worse, achieving a macro-average F1-score of 43%.
Future research could aim to design a better attention-based model that produces more accurate and more confident predictions, and to design algorithms that reduce the time complexity of the KNN implementation.
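The macro-average F1-score used to compare the two methods gives every class the same weight, regardless of how many samples it has. A minimal sketch of the standard definition follows; the function name and the toy labels are hypothetical and unrelated to the 59 emitter classes in the thesis.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        # F1 = 2*TP / (2*TP + FP + FN); the denominator is non-zero because
        # every label occurs at least once in y_true or y_pred.
        scores.append(2 * tp / (2 * tp + fp + fn))
    return sum(scores) / len(scores)


# Toy example: the frequent class "a" is always right, the rare classes never are.
y_true = ["a", "a", "a", "a", "b", "c"]
y_pred = ["a", "a", "a", "a", "c", "b"]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(round(accuracy, 3), round(macro_f1(y_true, y_pred), 3))
```

In this toy case accuracy is 4/6 while the macro-average F1 is 1/3, because the two rare classes each contribute a per-class score of zero. With many classes, a macro-average therefore penalizes a model that handles only the common emitters well.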
70

Predicting PV self-consumption in villas with machine learning

Galli, Fabian January 2021
In Sweden, there is a strong and growing interest in solar power. In recent years, photovoltaic (PV) system installations have increased dramatically, and a large part are distributed grid-connected PV systems, i.e. rooftop installations. Currently the electricity export price is significantly lower than the import price, which has made the amount of self-consumed PV electricity a critical factor when assessing system profitability. Self-consumption (SC) is calculated using hourly or sub-hourly timesteps and is highly dependent on the solar patterns of the location of interest, the PV system configuration and the building load. As these vary for every potential installation, it is difficult to make estimates without historical data on both load and local irradiance, which are often hard to acquire or unavailable. A method to predict SC using information commonly available at the planning phase is therefore preferred.  Documented SC data are scarce, and only a few reports treat the subject of mapping or predicting SC. This thesis therefore investigates the possibility of using machine learning to create models able to predict SC from the following inputs: annual load, annual PV production, tilt angle and azimuth angle of the modules, and latitude. Seven models are created in Python using regression techniques, with real load data and simulated PV data from the south of Sweden, and evaluated using the coefficient of determination (R²) and mean absolute error (MAE). The techniques are linear regression, polynomial regression, Ridge regression, Lasso regression, K-Nearest Neighbors (kNN), Random Forest and Multi-Layer Perceptron (MLP). An eighth model, a linear regression, follows the methodology of the only other SC prediction model found in the literature. A parametric analysis of the models is conducted, removing one variable at a time to assess each model's dependence on each variable.
The results are promising: five of the eight models achieve an R² value above 0.9 and can be considered good at predicting SC. The best-performing model, Random Forest, has an R² of 0.985 and an MAE of 0.0148. The parametric analysis also shows that while more input data is helpful, using only annual load and annual PV production is sufficient to make good predictions. However, these conclusions hold only for the southern region of Sweden, where the data originate, and do not apply to areas outside the latitudes or country tested.
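The two evaluation measures named in the abstract, the coefficient of determination (R²) and the mean absolute error (MAE), can be written out from their standard definitions. The sketch below is illustrative only: the function names and the self-consumption fractions are hypothetical values, not data from the thesis.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical self-consumption fractions (observed vs. predicted).
observed = [0.2, 0.4, 0.6, 0.8]
predicted = [0.25, 0.35, 0.65, 0.75]
print(round(mean_absolute_error(observed, predicted), 4),
      round(r_squared(observed, predicted), 4))
```

MAE is expressed in the same unit as the target, so if SC is a fraction between 0 and 1, an MAE of 0.0148 would correspond to an average error of about 1.5 percentage points. R² instead measures the share of the variance in the observed values that the model explains.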
