About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.

Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
421

Data-Driven Predictions of Heating Energy Savings in Residential Buildings

Lindblom, Ellen, Almquist, Isabelle January 2019 (has links)
Along with the increasing use of intermittent electricity sources, such as wind and solar power, comes a growing demand for user flexibility. This has paved the way for a new market of services that provide electricity customers with energy saving solutions. These include a variety of techniques, ranging from sophisticated control of the customers' home equipment to information on how to adjust their consumption behavior in order to save energy. This master thesis contributes further to this field by investigating an additional incentive: predictions of future energy savings related to indoor temperature. Five different machine learning models have been tuned and used to predict monthly heating energy consumption for a given set of homes. The model tuning process and performance evaluation were performed using 10-fold cross-validation. The best performing model was then used to predict how much heating energy each individual household could save by decreasing its indoor temperature by 1°C during the heating season. The highest prediction accuracy (about 78%) is achieved with support vector regression (SVR), closely followed by neural networks (NN). The simpler regression models that have been implemented are, however, not far behind. According to the SVR model, the average household is expected to lower its heating energy consumption by approximately 3% if the indoor temperature is decreased by 1°C.
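
As a rough illustration of this kind of approach, the sketch below tunes an SVR with 10-fold cross-validation and estimates each household's saving from a 1°C lower indoor temperature. The file name and feature columns are assumptions, not the thesis's actual data or model.

```python
# Hypothetical sketch: tune an SVR with 10-fold CV, then estimate the saving
# from lowering indoor temperature by 1 degree C. Column names are assumed.
import pandas as pd
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

df = pd.read_csv("households_monthly.csv")          # assumed input file
features = ["indoor_temp", "outdoor_temp", "floor_area", "month"]
X, y = df[features], df["heating_kwh"]

search = GridSearchCV(
    make_pipeline(StandardScaler(), SVR()),
    param_grid={"svr__C": [1, 10, 100], "svr__gamma": ["scale", 0.1, 0.01]},
    cv=10,                                           # 10-fold cross-validation
    scoring="neg_mean_absolute_error",
)
search.fit(X, y)

baseline = search.predict(X)
X_lower = X.assign(indoor_temp=X["indoor_temp"] - 1.0)   # 1 degree C decrease
saving_pct = 100 * (baseline - search.predict(X_lower)) / baseline
print(f"Mean predicted heating saving: {saving_pct.mean():.1f}%")
```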
422

SPARSE DISCRETE WAVELET DECOMPOSITION AND FILTER BANK TECHNIQUES FOR SPEECH RECOGNITION

Jingzhao Dai (6642491) 11 June 2019 (has links)
Speech recognition is widely applied to translation from speech to text, voice-driven commands, human-machine interfaces, and so on [1]-[8]. It has become increasingly pervasive in modern life. To improve the accuracy of speech recognition, various algorithms such as artificial neural networks and hidden Markov models have been developed [1], [2].

In this thesis work, speech recognition with various classifiers is investigated. The classifiers employed include the support vector machine (SVM), k-nearest neighbors (KNN), random forest (RF), and convolutional neural network (CNN). Two novel feature extraction methods, sparse discrete wavelet decomposition (SDWD) and bandpass filtering (BPF) based on the Mel filter banks [9], are developed and proposed. To suit the different classification algorithms, both one-dimensional (1D) and two-dimensional (2D) features are required. The 1D features are arrays of power coefficients in frequency bands, used for training the SVM, KNN, and RF classifiers, while the 2D features combine the frequency domain with temporal variation: each 2D feature consists of the power values in the decomposed bands across consecutive speech frames. The 2D features, with geometric transformations, are used to train the CNN.

The speech recordings, from both male and female speakers, come from a recorded data set as well as a standard data set. First, recordings with little noise and clear pronunciation are processed with the proposed feature extraction methods; after many trials and experiments on this dataset, a high recognition accuracy is achieved. The feature extraction methods are then applied to the standard recordings, which have more variable characteristics, ambient noise, and less clear pronunciation. The experimental results validate the effectiveness of the proposed feature extraction techniques.
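
The sketch below illustrates the general idea of wavelet-band features for speech classification (not the thesis's exact SDWD or Mel bandpass pipeline): sub-band energies from a discrete wavelet decomposition form a 1D feature vector for an SVM. The placeholder signals and labels stand in for real speech frames and word classes.

```python
# Illustrative sketch only: 1-D features from the energy of DWT sub-bands,
# fed to an SVM. Replace the placeholder data with real speech and labels.
import numpy as np
import pywt
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def dwt_band_energies(signal, wavelet="db4", level=5):
    """Energy (mean squared value) of each sub-band of a level-5 DWT."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    return np.array([np.mean(c ** 2) for c in coeffs])

# Placeholder data: replace with real speech frames and word labels.
rng = np.random.default_rng(0)
signals = [rng.standard_normal(16000) for _ in range(100)]
labels = rng.integers(0, 4, size=100)

X = np.vstack([dwt_band_energies(s) for s in signals])
X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.3, random_state=0)

clf = SVC(kernel="rbf", C=10).fit(X_tr, y_tr)
print("Held-out accuracy:", clf.score(X_te, y_te))
```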
423

An IoT Solution for Urban Noise Identification in Smart Cities : Noise Measurement and Classification

Alsouda, Yasser January 2019 (has links)
Noise is defined as any undesired sound. Urban noise and its effect on citizens are a significant environmental problem, and the increasing level of noise has become a critical issue in some cities. Fortunately, noise pollution can be mitigated by better planning of urban areas or controlled by administrative regulations. However, the execution of such actions requires well-established systems for noise monitoring. In this thesis, we present a solution for noise measurement and classification using a low-power and inexpensive IoT unit. To measure the noise level, we implement an algorithm for calculating the sound pressure level in dB, and we achieve a measurement error of less than 1 dB. Our machine learning-based method for noise classification uses Mel-frequency cepstral coefficients for audio feature extraction and four supervised classification algorithms (support vector machine, k-nearest neighbors, bootstrap aggregating, and random forest). We evaluate our approach experimentally with a dataset of about 3000 sound samples grouped into eight sound classes (such as car horn, jackhammer, or street music). We explore the parameter space of the four algorithms to estimate the optimal parameter values for the classification of sound samples in the dataset under study, and achieve noise classification accuracy in the range of 88% to 94%.
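
A minimal sketch of the two building blocks mentioned above, under assumed calibration and file handling: sound pressure level in dB from raw samples, and MFCC features for one of the four classifiers (here the SVM). This is not the thesis's implementation.

```python
# Rough sketch: (1) RMS-based sound pressure level in dB, (2) MFCC features
# for a supervised classifier. Calibration and file paths are assumptions.
import numpy as np
import librosa
from sklearn.svm import SVC

def sound_pressure_level(samples, p_ref=20e-6, calibration=1.0):
    """SPL in dB; `calibration` maps raw samples to pascals (assumed)."""
    p_rms = np.sqrt(np.mean((samples * calibration) ** 2))
    return 20 * np.log10(p_rms / p_ref)

def mfcc_features(path, sr=22050, n_mfcc=13):
    y, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)          # one fixed-length vector per clip

# `paths` and `classes` (e.g. "car_horn", "street_music") are assumed to exist:
# X = np.vstack([mfcc_features(p) for p in paths])
# clf = SVC(kernel="rbf", C=10, gamma="scale").fit(X, classes)
```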
424

Anomaly-based network intrusion detection enhancement by prediction threshold adaptation of binary classification models

Al Tobi, Amjad Mohamed January 2018 (has links)
Network traffic exhibits a high level of variability over short periods of time. This variability impacts negatively on the performance (accuracy) of anomaly-based network Intrusion Detection Systems (IDS) that are built using predictive models in a batch-learning setup. This thesis investigates how adapting the discriminating threshold of model predictions, specifically to the traffic being evaluated, improves the detection rates of these intrusion detection models. Specifically, this thesis studied the adaptability of three well-known machine learning algorithms: C5.0, Random Forest, and Support Vector Machine. The ability of these algorithms to adapt their prediction thresholds was assessed and analysed under different scenarios that simulated real-world settings using a prospective sampling approach. A new dataset (STA2018) was generated for this thesis and used for the analysis. The thesis demonstrates empirically the importance of threshold adaptation in improving the accuracy of detection models when training and evaluation (test) traffic have different statistical properties. Further investigation was undertaken to analyse the effects of feature selection and data balancing on a model's accuracy when the evaluation traffic had different significant features. The effects of threshold adaptation on reducing the accuracy degradation of these models were statistically analysed. The results showed that, of the three compared algorithms, Random Forest was the most adaptable and had the highest detection rates. The analysis was then extended to apply threshold adaptation on sampled traffic subsets, using different sample sizes, sampling strategies, and label error rates. This investigation showed the robustness of the Random Forest algorithm in identifying the best threshold: it needed a sample of only 0.05% of the original evaluation traffic to identify a discriminating threshold whose overall accuracy was nearly 90% of that achieved with the optimal threshold.
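
The idea of threshold adaptation can be sketched as follows, using synthetic data in place of the STA2018 traffic: fit a model on training traffic, then choose the discriminating threshold on a small labelled sample of the evaluation traffic rather than keeping the default of 0.5.

```python
# Hedged sketch of threshold adaptation on a synthetic stand-in dataset.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

def best_threshold(y_true, scores, grid=np.linspace(0.01, 0.99, 99)):
    """Threshold on P(attack) that maximises accuracy on the labelled sample."""
    return grid[int(np.argmax([accuracy_score(y_true, scores >= t) for t in grid]))]

# Synthetic stand-in for training vs. evaluation traffic.
X, y = make_classification(n_samples=20000, n_features=20, weights=[0.8], random_state=0)
X_train, X_eval, y_train, y_eval = train_test_split(X, y, test_size=0.5, random_state=0)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
scores = rf.predict_proba(X_eval)[:, 1]

# Adapt the threshold on a tiny labelled sample of the evaluation traffic.
rng = np.random.default_rng(0)
idx = rng.choice(len(X_eval), size=50, replace=False)
t = best_threshold(y_eval[idx], scores[idx])
print("adapted threshold:", t, "accuracy:", accuracy_score(y_eval, scores >= t))
```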
425

Image classification for a large number of object categories

Bosch Rué, Anna 25 September 2007 (has links)
The release of challenging data sets with ever-increasing numbers of object categories is forcing the development of image representations that can cope with multiple classes and of algorithms that are efficient in training and testing. This thesis explores the problem of classifying images by the object they contain in the case of a large number of categories. We first investigate whether the hybrid combination of a latent generative model with a discriminative classifier is beneficial for the task of weakly supervised image classification. We introduce a novel vocabulary using dense color SIFT descriptors, and then investigate classification performance by optimizing different parameters. A new way to incorporate spatial information within the hybrid system is also proposed, showing that contextual information provides strong support for image classification. We then introduce a new shape descriptor that represents local image shape and its spatial layout, together with a spatial pyramid kernel. Shape is represented as a compact vector descriptor suitable for use in standard learning algorithms with kernels. Experimental results show that shape information achieves classification performance similar to, and sometimes better than, methods using only appearance information. We also investigate how different cues of image information can be used together. We show that shape and appearance kernels may be combined and that additional information cues increase classification performance. Finally, we provide an algorithm to automatically select the regions of interest in training. This provides a method of inhibiting background clutter and adding invariance to the object instance's position. We show that using the shape and appearance representation over the regions of interest, together with a random forest classifier that automatically selects the best cues, improves both performance and speed. We compare our classification performance to that of previous methods using the authors' own datasets and testing protocols, and the set of innovations introduced here leads to an impressive increase in performance.
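
As a loose illustration of combining a shape cue with an appearance cue in a random forest, the sketch below uses HOG and a colour histogram as stand-ins for the thesis's own shape descriptor and vocabulary; the images and labels are assumed to be loaded elsewhere.

```python
# Illustrative sketch only: concatenate a shape cue (HOG stand-in) with an
# appearance cue (colour histogram) and classify with a random forest.
import numpy as np
from skimage.color import rgb2gray
from skimage.feature import hog
from skimage.transform import resize
from sklearn.ensemble import RandomForestClassifier

def shape_appearance_features(image, size=(128, 128), bins=8):
    img = resize(image, size, anti_aliasing=True)        # floats in [0, 1]
    shape_vec = hog(rgb2gray(img), pixels_per_cell=(16, 16), cells_per_block=(2, 2))
    colour_hist, _ = np.histogramdd(img.reshape(-1, 3), bins=bins, range=[(0, 1)] * 3)
    return np.concatenate([shape_vec, colour_hist.ravel() / colour_hist.sum()])

# `images` (RGB arrays) and `labels` are assumed to exist:
# X = np.vstack([shape_appearance_features(im) for im in images])
# clf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X, labels)
```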
426

Extracting meaningful statistics for the characterization and classification of biological, medical, and financial data

Woods, Tonya M. 21 September 2015 (has links)
This thesis is focused on extracting meaningful statistics for the characterization and classification of biological, medical, and financial data and contains four chapters. The first chapter contains theoretical background on scaling and wavelets, which supports the work in chapters two and three. In the second chapter, we outline a methodology for representing sequences of DNA nucleotides as numeric matrices in order to analytically investigate important structural characteristics of DNA. This methodology involves assigning unit vectors to nucleotides, placing the vectors into columns of a matrix, and accumulating across the rows of this matrix. Transcribing the DNA in this way allows us to compute the 2-D wavelet transformation and assess regularity characteristics of the sequence via the slope of the wavelet spectra. In addition to computing a global slope measure for a sequence, we can apply the methodology to overlapping sections of nucleotides to obtain an evolutionary slope. In the third chapter, we describe various ways wavelet-based scaling may be used for cancer diagnostics. There were nearly half a million new cases of ovarian, breast, and lung cancer in the United States last year. Breast and lung cancer have the highest prevalence, while ovarian cancer has the lowest survival rate of the three. Early detection is critical for all of these diseases, but substantial obstacles to early detection exist in each case. In this work, we use wavelet-based scaling on metabolic data and radiography images to produce meaningful features for classifying cases and controls. Computer-aided detection (CAD) algorithms for detecting lung and breast cancer often focus on select features in an image and make a priori assumptions about the nature of a nodule or a mass. In contrast, our approach to analyzing breast and lung images captures information contained in the background tissue as well as information about specific features, and makes no such a priori assumptions. In the fourth chapter, we investigate the value of social media data in building commercial default and activity credit models. We use random forest modeling, which has been shown in many instances to achieve better predictive accuracy than logistic regression in modeling credit data. This result is of interest, as some entities are beginning to build credit scores based on this type of publicly available online data alone. Our work shows that the addition of social media data does not provide any improvement in model accuracy over the bureau-only models. However, the social media data on its own does have some limited predictive power.
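
A minimal sketch of the slope-of-the-wavelet-spectrum idea appears below, simplified to a 1-D transform applied to each coordinate of a nucleotide walk rather than the thesis's 2-D transform; the unit-vector assignment and wavelet choice are assumptions.

```python
# Hedged sketch: map a DNA sequence to a 2-D walk, take a 1-D DWT of each
# coordinate, and fit a slope to log2(energy) versus decomposition level.
import numpy as np
import pywt

STEP = {"A": (1, 0), "C": (0, 1), "G": (-1, 0), "T": (0, -1)}   # assumed unit vectors

def walk(sequence):
    """Cumulative sum of per-nucleotide unit vectors (one row per position)."""
    return np.cumsum([STEP[n] for n in sequence], axis=0)

def wavelet_slope(signal, wavelet="haar", level=8):
    coeffs = pywt.wavedec(signal, wavelet, level=level)[1:]     # detail levels only
    log_energy = [np.log2(np.mean(c ** 2)) for c in reversed(coeffs)]
    levels = np.arange(1, len(log_energy) + 1)
    return np.polyfit(levels, log_energy, 1)[0]                 # spectral slope

rng = np.random.default_rng(0)
seq = "".join(rng.choice(list("ACGT"), size=4096))              # placeholder sequence
xy = walk(seq)
print("slope (x, y):", wavelet_slope(xy[:, 0]), wavelet_slope(xy[:, 1]))
```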
427

Predicting inter-frequency measurements in an LTE network using supervised machine learning : a comparative study of learning algorithms and data processing techniques / Att prediktera inter-frekvensmätningar i ett LTE-nätverk med hjälp av övervakad maskininlärning

Sonnert, Adrian January 2018 (has links)
With increasing demands on network reliability and speed, network suppliers need to make their communication algorithms more efficient. Frequency measurements are a core part of mobile network communications; making them more effective would improve many network processes such as handovers, load balancing, and carrier aggregation. This study examines the possibility of using supervised learning to predict the signal of inter-frequency measurements by investigating various learning algorithms and pre-processing techniques. We found that random forests have the highest predictive performance on this data set, at 90.7% accuracy. In addition, we have shown that undersampling and varying the discriminator are effective techniques for increasing the performance on the positive class on frequencies where the negative class is prevalent. Finally, we present hybrid algorithms in which the learning algorithm for each model depends on attributes of the training data set. These algorithms are much more efficient in terms of memory and run-time without heavily sacrificing predictive performance.
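
The two data-processing techniques highlighted above can be sketched as follows on a synthetic imbalanced data set (not the LTE measurement data): undersample the prevalent negative class, then vary the discriminator (decision threshold) of a random forest to trade precision for positive-class recall.

```python
# Hedged sketch of undersampling and discriminator variation on synthetic data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=30000, weights=[0.95], random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=1)

# Undersample the negative class to the size of the positive class.
rng = np.random.default_rng(1)
pos, neg = np.flatnonzero(y_tr == 1), np.flatnonzero(y_tr == 0)
keep = np.concatenate([pos, rng.choice(neg, size=len(pos), replace=False)])
rf = RandomForestClassifier(n_estimators=200, random_state=1).fit(X_tr[keep], y_tr[keep])

# Vary the discriminator: a lower threshold raises recall on the positive class.
for threshold in (0.5, 0.3, 0.1):
    y_pred = rf.predict_proba(X_te)[:, 1] >= threshold
    print(threshold, "positive-class recall:", round(recall_score(y_te, y_pred), 3))
```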
428

Pupilometria na investigação de diabetes mellitus tipo II / Pupilometry in the Investigation of diabetes mellitus type II

Silva, Cleyton Rafael Gomes 28 September 2018 (has links)
Examining human pupillary behavior is a non-invasive, low-cost method for assessing neurological activity. Changes in this behavior are correlated with various health conditions, such as Parkinson's, Alzheimer's, autism, and diabetes. Obtaining information about pupillary behavior requires measuring the pupil diameter during procedures that induce pupillary reflexes, a practice known as pupillometry. The pupillary measurement is made by filming these procedures and applying computer vision techniques for pupil recognition. The objective of this research was to develop an Automated Pupillometry System (SAP) to support the investigation of patients with type II diabetes mellitus. SAP is able to record the procedures, induce the pupillary reflexes, and extract 96 pupil features. In an experiment with 15 healthy patients and 16 diabetics, an accuracy of 94% was obtained in the identification of type II diabetics, demonstrating the efficiency of SAP for performing examinations and highlighting the potential of pupillometry in the investigation of type II diabetes mellitus.
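
One ingredient of such a system, estimating the pupil diameter from a single eye frame with classical computer vision, might look like the rough sketch below; the threshold value and camera setup are assumptions, not SAP's actual pipeline.

```python
# Hedged sketch: pupil diameter in pixels from one eye frame, treating the
# pupil as the darkest large blob. Threshold and calibration are assumptions.
import cv2
import numpy as np

def pupil_diameter_px(frame_bgr, dark_threshold=40):
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (7, 7), 0)
    _, mask = cv2.threshold(blurred, dark_threshold, 255, cv2.THRESH_BINARY_INV)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    pupil = max(contours, key=cv2.contourArea)        # largest dark region
    (_, _), radius = cv2.minEnclosingCircle(pupil)
    return 2 * radius

# Tracking this diameter frame by frame during a light stimulus yields the
# pupillary-reflex curve from which features can then be extracted.
```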
429

Využití vybraných metod strojového učení pro modelování kreditního rizika / Machine Learning Methods for Credit Risk Modelling

Drábek, Matěj January 2017 (has links)
This master's thesis is divided into three parts. In the first part I describe P2P lending, its characteristics, basic concepts, and practical implications, and compare the P2P markets in the Czech Republic, the UK, and the USA. The second part covers the theoretical basics of the chosen machine learning methods: the naive Bayes classifier, classification tree, random forest, and logistic regression. It also describes methods for evaluating the quality of the classification models listed above. The third part is practical and shows the complete workflow of creating a classification model, from data preparation to model evaluation.
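
A minimal sketch of that workflow, comparing the four models with cross-validated AUC on an assumed pre-processed loan table (the file name and `default` column are hypothetical):

```python
# Hedged sketch: compare the four classifiers named above with 5-fold CV.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

df = pd.read_csv("p2p_loans_clean.csv")          # assumed file and schema
X, y = df.drop(columns="default"), df["default"]

models = {
    "naive Bayes": GaussianNB(),
    "classification tree": DecisionTreeClassifier(max_depth=5),
    "random forest": RandomForestClassifier(n_estimators=300),
    "logistic regression": LogisticRegression(max_iter=1000),
}
for name, model in models.items():
    auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    print(f"{name}: AUC = {auc.mean():.3f} +/- {auc.std():.3f}")
```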
430

Methodology of surface defect detection using machine vision with magnetic particle inspection on tubular material / Méthodologie de détection des défauts de surface par vision artificielle avec magnetic particle inspection sur le matériel tubulaire

Mahendra, Adhiguna 08 November 2012 (has links)
Industrial surface inspection of tubular material based on Magnetic Particle Inspection (MPI) is a challenging task. Magnetic Particle Inspection is a well-known method for non-destructive testing whose goal is to reveal the presence of cracks in the tubular surface: the surface is coated with a solution containing magnetic particles, magnetized, and illuminated under ultraviolet light. Currently, MPI-based inspection of tubular material at the Vallourec production sites still relies on the judgment of a human inspector. It is a time-consuming and tedious job, and it is prone to error due to human eye fatigue. In this thesis we propose a machine vision approach to detect defects in MPI images of the tubular surface automatically, without human supervision and with the best possible detection rate. We focus on crack-like defects since they represent the majority. To fulfill this objective, a machine vision methodology is developed step by step, from image acquisition to defect classification. The proposed framework was developed according to industrial constraints and standards, hence accuracy, computational speed, and simplicity were very important. Based on Magnetic Particle Inspection principles, an acquisition system is developed and optimized in order to acquire images of the tubular surfaces for storage or processing. The characteristics of the crack-like defects, with respect to their geometric model and curvature, are used as prior knowledge for mathematical morphology and linear filtering. After segmentation and binarization of the image, a vast number of defect candidates remains. Aside from geometric and intensity features, multiresolution analysis is performed on the images to extract textural features. Finally, classification into defect and non-defect classes is performed with a random forest classifier, chosen for its robustness and speed, and compared with other classifiers such as the support vector machine and decision trees. The parameters for mathematical morphology, linear filtering, and classification are analyzed and optimized with design of experiments based on the Taguchi approach, complemented by genetic algorithms; this optimization step yields significant gains in time and efficiency.

Two image databases are used, corresponding to two product types ("tool joints" and "tube couplings"), with one third of the images in each case used for training. The random forest classifier combined with geometric features and texture features extracted from a wavelet decomposition gives the best classification rate for defects on tool joints (95.5%). For the tube couplings, the best rate is obtained by the SVM with multiresolution analysis (89.2%), while the random forest approach offers a good compromise at 82.4%. The main industrial requirement of a 100% defect detection rate is thus approached, with rates of the order of 90%; the misdetections (false positives and false negatives) originate mainly in the machined appearance of some parts of the tube ("hard bending") and can be further reduced. Experiments are performed on tubular materials and evaluated for accuracy and robustness by comparing ground truth and processed images. The methodology can be replicated for different surface inspection applications, with or without MPI, especially those related to surface crack detection.
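
The final classification stage might be sketched as below, with geometric and wavelet-texture features feeding a random forest and an SVM; the placeholder patches and feature set are assumptions, not the thesis's actual pipeline or results.

```python
# Hedged sketch: candidate regions described by geometric and wavelet-texture
# features, compared across a random forest and an SVM.
import numpy as np
import pywt
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def texture_features(patch, wavelet="db2", level=2):
    """Energy of each 2-D wavelet sub-band of a grayscale candidate patch."""
    coeffs = pywt.wavedec2(patch, wavelet, level=level)
    feats = [np.mean(np.asarray(coeffs[0]) ** 2)]
    for detail in coeffs[1:]:
        feats.extend(np.mean(np.asarray(d) ** 2) for d in detail)
    return np.array(feats)

# Placeholder candidates; in practice these come from the segmented MPI images.
rng = np.random.default_rng(2)
patches = rng.standard_normal((200, 64, 64))
geometry = rng.standard_normal((200, 4))          # e.g. area, elongation, curvature
labels = rng.integers(0, 2, size=200)             # defect vs. non-defect

X = np.hstack([geometry, np.vstack([texture_features(p) for p in patches])])
for name, clf in [("random forest", RandomForestClassifier(n_estimators=300)),
                  ("SVM", SVC(kernel="rbf", C=10))]:
    print(name, cross_val_score(clf, X, labels, cv=3).mean())
```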
