Global ETD Search

321	An Approach to Self-Supervised Object Localisation through Deep Learning Based Classification Politov, Andrei 28 December 2021 Deep learning has become ubiquitous in science and industry for classifying images and identifying patterns in data. The most widely used approach to training convolutional neural networks is supervised learning, which requires a large set of annotated data. To avoid the high cost of collecting and annotating datasets, self-supervised learning methods offer a promising way to learn common features of images and videos from large-scale unlabeled data without human-annotated labels. This thesis presents the results of using self-supervised learning and explainable AI to localise objects in images from electron microscopes. The work used a synthetic geometric dataset and a synthetic pollen dataset, with classification as the pretext task. Several explainable AI methods were applied: Grad-CAM and backpropagation-based approaches showed little promise, whereas the Extremal Perturbation method proved effective. In the downstream localisation task, the objects of interest were detected with competitive accuracy in one-class images. The advantages and limitations of the approach are analysed, and directions for further work are proposed. ddc:004
322	Using AI for Evaluating and Classifying E-mails with Limited Data Sets Malm, Daniel January 2022 This report evaluates methods for classifying e-mails into categories. Many e-mails arrive in people's inboxes every day, and as their number grows over time it becomes harder to find specific ones. HDAB works in consulting and wants to sort e-mails into separate folders according to the project they belong to. Today this is done by a word-based rule system that sorts e-mails into folders with a precision of about 85%. HDAB wants to know whether machine learning can sort the e-mails automatically in place of the current solution. The report presents four machine learning algorithms (decision tree, random forest, k-nearest neighbor and naive Bayes) that are used to evaluate whether the e-mails can be categorized. The data used in the report come from HDAB's mail server and are already labeled with their correct categories.
Email foldering supervised learning machine learning Computer Sciences
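323	A Unified Generative and Discriminative Approach to Automatic Chord Estimation for Music Audio Signals / 音楽音響信号に対する自動コード推定のための生成・識別統合的アプローチ Wu, Yiming 24 September 2021 Kyoto University / Doctorate by coursework (new system) / Doctor of Informatics / Kō No. 23540 / Jōhaku No. 770 / 新制||情||131 (University Library) / Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University / (Chief examiner) Associate Professor Kazuyoshi Yoshii, Professor Tatsuya Kawahara, Professor Ko Nishino, Professor Hisashi Kashima / Conferred under Article 4, Paragraph 1 of the Degree Regulations / Doctor of Informatics / Kyoto University / DFAM Automatic chord estimation Variational autoencoder Semi-supervised learning Multi-task learning 007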
324	Classification of Fiction Genres : Text classification of fiction texts from Project Gutenberg Bucher, Rolf January 2018 Stylometric analysis in text classification is most often used in authorship attribution studies. This thesis used a machine learning algorithm, the Naive Bayes classifier, in a text classification task comparing stylometric and lexical features. The texts were extracted from the Project Gutenberg website and comprised three genres: detective fiction, fantasy, and science fiction. The aim was to see how well the classifier distinguished the genres from one another in a supervised learning task. R was used to extract the texts from Project Gutenberg and a Python script was used to run the experiment. Approximately 1978 texts were extracted and preprocessed. Tf-idf weighting with univariate filtering served as the lexical feature set, while average sentence length, average word length, number of characters, number of punctuation marks, number of uppercase words, number of title case words, and parts-of-speech tags for nouns, verbs, and adjectives were generated as the topic-independent stylometric feature set. Normalization used the ℓ² norm for the tf-idf weighting, and the ℓ² norm together with z-score standardization for the stylometric features. Multinomial Naive Bayes was run on the lexical feature set and Gaussian Naive Bayes on the stylometric set, both with 10-fold cross-validation. Precision was the measure used to assess the performance of the classifier. The classifier performed better in the lexical features experiment than in the stylometric features experiment, suggesting that downsampling, more stylometric features, and more classes would have been beneficial. text classification genre machine learning supervised Gutenberg fiction Information Studies Library and Information Science
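The two normalization steps named in record 324 (the ℓ² norm and z-score standardization) can be illustrated with a minimal sketch. It is not the thesis's script: the function names and the sample values are hypothetical, and the z-score here assumes the population standard deviation, which the abstract does not specify.

```python
import math
from statistics import mean, pstdev


def l2_normalize(vector):
    """Scale a feature vector to unit Euclidean length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


def z_score(column):
    """Standardize one feature column to zero mean and unit population standard deviation."""
    mu = mean(column)
    sigma = pstdev(column)  # assumption: population (not sample) standard deviation
    return [(x - mu) / sigma if sigma else 0.0 for x in column]


# Hypothetical values for illustration only; they are not data from the thesis.
tfidf_row = [3.0, 4.0, 0.0]                # tf-idf weights of one text
avg_sentence_lengths = [12.0, 18.0, 24.0]  # one stylometric feature across three texts

unit_row = l2_normalize(tfidf_row)
standardized = z_score(avg_sentence_lengths)
```

The ℓ² norm is applied per text (row), so texts of different lengths become comparable, while the z-score is applied per feature (column), so features measured on different scales contribute comparably.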
325	Learning from Scholarly Attributed Graphs for Scientific Discovery Akujuobi, Uchenna Thankgod 18 October 2020 Research and experimentation in various scientific fields build on the knowledge and ideas in scholarly literature. The advancement of research and development has thus strengthened the importance of analyzing and understanding that literature. In recent years, however, researchers have faced a massive volume of scholarly documents published at an exponentially increasing rate. Analyzing this vast number of publications is far beyond the capability of individual researchers. This dissertation is motivated by the need for large-scale analysis of the rapidly growing body of scholarly literature for scientific knowledge discovery. In the first part of this dissertation, the interdependencies between scholarly literature are studied. First, I develop Delve, a data-driven search engine supported by our semi-supervised edge classification method. This system enables users to search and analyze the relationship between datasets and scholarly literature. Based on the Delve system, I propose to study information extraction as a node classification problem in attributed networks. Specifically, if we can learn the research topics of documents (nodes in a network), we can aggregate documents by topic and retrieve information specific to each topic (e.g., the top-k popular datasets). Node classification in attributed networks poses several challenges: a limited number of labeled nodes, effective fusion of topological structure and node/edge attributes, and the co-existence of multiple labels for one node. Existing node classification approaches address only some of these challenges, and some only partially. This dissertation addresses them by proposing semi-supervised multi-class/multi-label node classification models that integrate node/edge attributes and topological relationships. 
The second part of this dissertation examines the problem of analyzing the interdependencies between terms in scholarly literature. I present two algorithms for the automatic hypothesis generation (HG) problem, which refers to the discovery of meaningful implicit connections between scientific terms, including but not limited to diseases, drugs, and genes extracted from databases of biomedical publications. The automatic hypothesis generation problem is modeled as a future connectivity prediction in a dynamic attributed graph. The key is to capture the temporal evolution of node-pair (term-pair) relations. Experiment results and case study analyses highlight the effectiveness of the proposed algorithms compared to the baselines’ extension. semi-supervised learning graph-based learning hypothesis generation reinforcement learning machine learning artificial intelligence
326	Indoor 3D Scene Understanding Using Depth Sensors Lahoud, Jean 09 1900 One of the main goals in computer vision is to achieve a human-like understanding of images. Image understanding has nevertheless been studied mainly in the 2D image frame, so more information is needed to relate images to the 3D world. With the emergence of 3D sensors (e.g. the Microsoft Kinect), which provide depth along with color information, propagating 2D knowledge into 3D becomes more attainable and enables interaction between a machine (e.g. a robot) and its environment. This dissertation focuses on three aspects of indoor 3D scene understanding: (1) 2D-driven 3D object detection for single-frame scenes with inherent 2D information, (2) 3D object instance segmentation for 3D reconstructed scenes, and (3) using room and floor orientation for automatic labeling of indoor scenes, which could be used for self-supervised object segmentation. These methods capture the physical extents of 3D objects, such as their sizes and actual locations within a scene. Depth sensors 3D Understanding 3D instance segmentation 3D object detection self-supervised pertaining object recognition
327	Determining an optimal approach for human occupancy recognition in a study room using non-intrusive sensors and machine learning Korduner, Lars, Sundquist, Mattias January 2019 Human recognition using sensors and machine learning is a field with many practical applications. Some commercial products can reliably recognise humans using video cameras. Video cameras often raise privacy concerns, however, and the related work suggests that in some situations a video camera is not necessarily more reliable than low-cost, non-intrusive ambient sensors. Recognising the number of people in a small study or office room is one such situation. Although there have been many successful studies of human occupancy recognition with various sensors and machine learning algorithms, the question of which combination of sensors and algorithms is generally better remains open. This thesis sets out to test five promising sensors in combination with six different machine learning algorithms to determine which combination outperforms the rest. To achieve this, an Arduino prototype was built to collect and save the readings from all five sensors to a text file every second. The Arduino, along with the sensors, was placed in a small study room at Malmö University to collect data on two separate occasions while students used the room as usual. The collected data was then used to train and evaluate five machine learning classifiers for each of the possible combinations of sensors and machine learning algorithms, for both occupancy detection and occupancy count. The experiment found that all algorithms could achieve an accuracy of at least 90%, usually with more than one combination of sensors. The highest accuracy achieved was 97%. Supervised Machine Learning Arduino Sensor Experiment Human Occupancy Recognition Engineering and Technology
328	Géodétection des réseaux enterrés par imagerie radar / Geodetection of buried utilities from radar imagery Terrasse, Guillaume 28 March 2017 The objective of the thesis is to improve the processing of ground penetrating radar output (the radargram, or B-scan) and to give the operator a clear and intuitive visualisation of the data, so that buried pipe networks can be located precisely. In particular, we wish to highlight the hyperbolas in the radargram, because they are characteristic of the presence of a pipe. First, we addressed the removal of useless information (clutter) that can hinder the detection of hyperbolas, and proposed a method for filtering clutter and noise from radargrams. Next, we developed a method that automatically detects the hyperbolas in a radargram and estimates their mathematical functions in quasi real time. Finally, we proposed a source separation method that distinguishes the clutter from the useful signal of the radargram with minimal impact on the hyperbolas. This last work opens further possibilities for filtering, hyperbola enhancement and automatic hyperbola detection. Ground penetrating radar Wavelet transform Supervised learning Optimization
329	Novel Semi-Supervised Learning Models to Balance Data Inclusivity and Usability in Healthcare Applications January 2019 Semi-supervised learning (SSL) is a subfield of statistical machine learning that is useful for problems with only a few labeled instances, which carry both predictor (X) and target (Y) information, and an abundance of unlabeled instances, which carry only predictor (X) information. SSL harnesses the target information available in the limited labeled data, as well as the information in the abundant unlabeled data, to build strong predictive models. However, not all the included information is useful. For example, some features may correspond to noise, and including them will hurt predictive performance. Additionally, some instances may be less relevant to model building, and including them will increase training time and potentially hurt model performance. The objective of this research is to develop novel SSL models that balance data inclusivity and usability. My dissertation research focuses on applications of SSL in healthcare, driven by problems in brain cancer radiomics, migraine imaging, and Parkinson’s Disease telemonitoring. The first topic introduces an integration of machine learning (ML) and a mechanistic model (PI) to develop an SSL model for predicting the cell density of glioblastoma brain cancer from multi-parametric medical images. The proposed ML-PI hybrid model integrates imaging information from unbiopsied regions of the brain with underlying biological knowledge from the mechanistic model to predict spatial tumor density in the brain. The second topic develops a multi-modality imaging-based diagnostic decision support system (MMI-DDS). 
MMI-DDS consists of modality-wise principal components analysis to incorporate imaging features at different aggregation levels (e.g., voxel-wise, connectivity-based, etc.), a constrained particle swarm optimization (cPSO) feature selection algorithm, and a clinical utility engine that utilizes inverse operators on chosen principal components for white-box classification models. The final topic develops a new SSL regression model with integrated feature and instance selection called s2SSL (with “s2” referring to selection in two different ways: feature and instance). s2SSL integrates cPSO feature selection and graph-based instance selection to simultaneously choose the optimal features and instances and build accurate models for continuous prediction. s2SSL was applied to smartphone-based telemonitoring of Parkinson’s Disease patients. / Dissertation/Thesis / Doctoral Dissertation Industrial Engineering 2019 Industrial engineering Biomedical engineering Bioinformatics glioblastoma graph sampling migraine particle swarm optimization semi-supervised learning telemonitoring
330	Analog Implicit Functional Testing using Supervised Machine Learning Bawaskar, Neerja Pramod 27 October 2014 Testing analog circuits is more difficult than testing digital circuits. The reasons include continuous time and amplitude signals, the lack of well-accepted testing techniques, and the time and cost that testing requires. The traditional method for testing analog circuits involves measuring all the performance parameters and comparing them with the limits of the data-sheet specifications. Because of the large number of data-sheet specifications, test generation and application require long test times and expensive test equipment. This thesis proposes an implicit functional testing technique for analog circuits that can be easily implemented in BIST circuitry. The proposed technique does not require measuring data-sheet performance parameters. To simplify testing, only a time-domain digital input is required. For each circuit under test (CUT), a cross-covariance signature is computed from the test input and the CUT's output. The proposed method requires a training sample of the CUT to be binned to the data-sheet specifications. The cross-covariance signatures of the binned CUT sample are mapped with a supervised machine learning classifier. For each bin, the classifiers select unique subsets of the cross-covariance signature. The trained classifier is then used to bin newly manufactured copies of the CUT. The proposed technique is evaluated on synthetic data generated from Monte Carlo simulation of the nominal circuit. Results show that the machine learning classifier must be chosen to match the imbalanced bin populations common in analog circuit testing. For sample sizes of 700+ and training for individual bins, classifier test escape rates ranged from 1000 DPM to 10,000 DPM. Analog integrated circuits -- Testing Electrical and Computer Engineering
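The cross-covariance signature mentioned in record 330 can be sketched as follows. The abstract does not give the estimator, the number of lags, or the test stimulus, so this example assumes the standard biased sample cross-covariance at non-negative lags; the function name `cross_covariance` and the signals are hypothetical.

```python
from statistics import mean


def cross_covariance(x, y, max_lag):
    """Biased sample cross-covariance of x and y at lags 0..max_lag.

    Entry k is (1/n) * sum over t of (x[t] - mean(x)) * (y[t + k] - mean(y)).
    """
    if len(x) != len(y):
        raise ValueError("input and output must have the same length")
    n = len(x)
    mx, my = mean(x), mean(y)
    return [
        sum((x[t] - mx) * (y[t + lag] - my) for t in range(n - lag)) / n
        for lag in range(max_lag + 1)
    ]


# Hypothetical signals: a binary test input and an output that follows it one sample later.
stimulus = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
response = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0]

signature = cross_covariance(stimulus, response, max_lag=2)
```

In this toy case the largest entry of `signature` is at lag 1, which reflects the one-sample delay between input and output. A vector of this kind, computed per circuit, is the sort of feature a supervised classifier could take as input for binning.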
