111

Comparing Julia and Python : An investigation of the performance on image processing with deep neural networks and classification

Axillus, Viktor January 2020 (has links)
Python is the most popular language for prototyping and developing machine learning algorithms. However, Python is an interpreted language, which incurs a significant performance penalty compared to compiled languages. Julia is a newer language that tries to bridge the gap between high-performance but cumbersome languages such as C++ and highly abstracted but typically slow languages such as Python. Over the years, the Python community has also developed many tools that address its performance problems. This raises the question of whether choosing one language over the other makes any significant performance difference. This thesis compares the performance, in terms of execution time, of the two languages in the machine learning domain: more specifically, image processing with GPU-accelerated deep neural networks and classification with k-nearest neighbors on the MNIST and EMNIST datasets. Python with Keras and TensorFlow is compared against Julia with Flux for GPU-accelerated neural networks. For classification, Python with Scikit-learn is compared against Julia with NearestNeighbors.jl. The results indicate that Julia has a performance edge for GPU-accelerated deep neural networks, outperforming Python by roughly 1.25x to 1.5x. For classification with k-nearest neighbors the results were more varied, with Julia outperforming Python in 5 out of 8 measurements. However, some validity threats remain, and additional research covering all frameworks available for the two languages is needed to provide a more conclusive and generalized answer.
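
A minimal Python-side sketch of the kind of k-nearest-neighbor timing comparison described above, using scikit-learn on an MNIST subset. The dataset fetch, subset size, and k value are illustrative assumptions, not the thesis's exact experimental setup.

```python
# Hedged sketch: timing k-NN classification on an MNIST subset in Python.
# Subset size and k=5 are illustrative choices, not the thesis configuration.
import time
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
X_train, X_test, y_train, y_test = train_test_split(
    X[:10000], y[:10000], test_size=0.2, random_state=0)

clf = KNeighborsClassifier(n_neighbors=5)

start = time.perf_counter()
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
elapsed = time.perf_counter() - start

print(f"accuracy={accuracy:.3f}, wall time={elapsed:.2f} s")
```

The equivalent Julia measurement would time NearestNeighbors.jl on the same split, so that only the language and library differ between runs.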
112

Diagnostic prediction on anamnesis in digital primary health care / Diagnostisk predicering genom anamnes inom den digitala primärvården

Kindblom, Marie January 2018 (has links)
Primary health care is facing extensive changes due to digitalization, while the field of application for machine learning is expanding. The merging of these two fields could result in a range of outcomes, one of them being an improved and more rigorous adoption of clinical decision support systems. Clinical decision support systems have been around for a long time but are still not fully adopted in primary health care due to insufficient performance and interpretability. Clinical decision support systems have a range of supportive functions to assist the clinician during decision making, where one of the most researched topics is diagnostic support. This thesis investigates how self-described anamnesis, in the form of free text and multiple-choice questions, performs in predicting the diagnostic outcome. The chosen approach is to compare text with different subsets of multiple-choice questions for diagnostic prediction across a range of classification methods. The results indicate that text data holds a substantial amount of information, and that the multiple-choice questions used in this study are of varying quality and overall suboptimal compared to the text data. The overall tendency is that Support Vector Machines perform well on text classification, and that Random Forests and Naive Bayes perform on par with Support Vector Machines on multiple-choice questions. / Primärvården förväntas genomgå en utbredd digitalisering under de kommande åren, samtidigt som maskininlärning får utökade tillämpningsområden. Sammanslagningen av dessa två fält möjliggör en mängd förbättrade tekniker, varav en vore ett förbättrat och mer rigoröst anammande av kliniska beslutsstödsystem. Det har länge funnits varianter av kliniska beslutsstödsystem, men de har ännu inte lyckats blivit fullständigt inkorporerade i primärvården, framför allt på grund av bristfällig prestanda och förmåga till tolkning. Kliniskt beslutstöd erbjuder en mängd funktioner för läkare vid beslutsfattning, där ett av de mest uppmärksammade fälten inom forskningen är support vid diagnosticering. Denna uppsats ämnar att undersöka hur självbeskriven anamnes i form av fritext och flervalsfrågor presterar för förutsägning av diagnos. Det valda tillvägagångssättet har varit att jämföra text med olika delmängder av flervalsfrågor med hjälp av en mängd metoder för klassificering. Resultaten indikerar att textdatan innehåller en avsevärt större mängd information än flervalsfrågorna, samt att flervalsfrågorna som har använts i denna studie är av varierande kvalité, men generellt sett suboptimala vad gäller prestanda i jämförelse med textdatan. Den generella tendensen är att Support Vector Machines presterar bra för klassificering med text data medan Random Forests och Naive Bayes är likvärdiga alternativ till Support Vector Machines för predicering vid användning av flervalsfrågor.
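
A hedged sketch of the comparison described above: free-text anamnesis versus multiple-choice answers as input for diagnosis prediction, using scikit-learn. The file and column names ("anamnesis.csv", "anamnesis_text", "mcq_*", "diagnosis") are hypothetical.

```python
# Hedged sketch: text features (TF-IDF + linear SVM) vs. multiple-choice
# features (Random Forest) for diagnosis prediction. All names are hypothetical.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

df = pd.read_csv("anamnesis.csv")      # hypothetical dataset of patient records
y = df["diagnosis"]

# Text pathway: TF-IDF features from the free-text anamnesis + linear SVM.
X_text = TfidfVectorizer(max_features=20000).fit_transform(df["anamnesis_text"])
print("SVM on text:", cross_val_score(LinearSVC(), X_text, y, cv=5).mean())

# Multiple-choice pathway: already-numeric answers + Random Forest.
X_mcq = df.filter(regex="^mcq_").to_numpy()
print("RF on MCQ:  ", cross_val_score(RandomForestClassifier(), X_mcq, y, cv=5).mean())
```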
113

Classification of Healthy and Alzheimer's Patients Using Electroencephalography and Supervised Machine Learning / Klassifiering av friska och alzheimers patienter med hjälp av elektroencefalografi och maskininlärning

Javanmardi, Ramtin, Rehman, Dawood January 2018 (has links)
Alzheimer's is one of the most costly illnesses that exists today, and the number of people with Alzheimer's disease is expected to increase by roughly 100 million by the year 2050. The medication that exists today is most effective if Alzheimer's is detected at an early stage, since these medications do not cure Alzheimer's but only slow the progression of the disease. Electroencephalography (EEG) is a relatively cheap diagnostic method compared to, for example, Magnetic Resonance Imaging. However, it is not clear how a human analyst can deduce from EEG data alone whether a patient has Alzheimer's disease. This is the underlying motivation for our investigation: can supervised machine learning methods be used for pattern recognition, using only the spectral power of EEG data, to tell whether an individual has Alzheimer's disease or not? The trained supervised machine learning models showed an average accuracy above 80%. This indicates that there is a difference in the neural oscillations of the brain between healthy individuals and Alzheimer's disease patients, which the machine learning methods are able to detect using pattern recognition. / Alzheimer är en av de mest kostsamma sjukdomar som existerar idag och antalet människor med alzheimer förväntas öka med omkring 100 miljoner människor tills 2050. Den medicinska hjälp som finns tillgänglig idag är som mest effektiv om man upptäcker Alzheimer i ett tidigt stadium eftersom dagens mediciner inte botar sjukdomen utan fungerar som bromsmedicin. Elektroencefalografi är en relativt billig metod för diagnostisering jämfört med Magnetisk resonanstomografi. Det är emellertid inte tydligt hur en läkare eller annan tränad individ ska tolka EEG-datan för att kunna avgöra om det är en patient med alzheimers som de kollar på. Så den bakomliggande motivationen till vår undersökning är: Kan man med hjälp av övervakad maskininlärning i kombination med spektral kraft från EEG-datan skapa modeller som kan avgöra om en patient har alzheimers eller inte? Medelvärdet av våra modellers noggrannhet var över 80%. Detta tyder på att det finns en faktisk skillnad mellan hjärnsignalerna hos en patient med alzheimer och en frisk individ, och att man med hjälp av maskininlärning kan hitta dessa skillnader som en människa enkelt missar.
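
A hedged sketch of the pipeline the abstract describes: spectral power of EEG epochs as features for a supervised classifier. The sampling rate, frequency bands, and the way epochs and labels are loaded are illustrative assumptions.

```python
# Hedged sketch: EEG band powers (Welch PSD) -> SVM for healthy vs. Alzheimer's.
# Sampling rate, band limits, and input files are illustrative assumptions.
import numpy as np
from scipy.signal import welch
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

FS = 256  # assumed sampling rate in Hz
BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}

def band_powers(epoch):
    """Mean spectral power per band for one epoch of shape (channels, samples)."""
    freqs, psd = welch(epoch, fs=FS, nperseg=2 * FS, axis=-1)
    feats = []
    for lo, hi in BANDS.values():
        mask = (freqs >= lo) & (freqs < hi)
        feats.append(psd[:, mask].mean(axis=-1))
    return np.concatenate(feats)

# epochs: (n_epochs, n_channels, n_samples); labels: 0 = healthy, 1 = Alzheimer's.
epochs = np.load("eeg_epochs.npy")   # hypothetical preprocessed data
labels = np.load("eeg_labels.npy")
X = np.array([band_powers(e) for e in epochs])
print("CV accuracy:", cross_val_score(SVC(kernel="rbf"), X, labels, cv=5).mean())
```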
114

Efficient And Scalable Evaluation Of Continuous, Spatio-temporal Queries In Mobile Computing Environments

Cazalas, Jonathan M 01 January 2012 (has links)
A variety of research exists for the processing of continuous queries in large, mobile environments. Each method tries, in its own way, to address the computational bottleneck of constantly processing so many queries. For this research, we present a two-pronged approach to addressing this problem. Firstly, we introduce an efficient and scalable system for monitoring traditional, continuous queries by leveraging the parallel processing capability of the Graphics Processing Unit (GPU). We examine a naive CPU-based solution for continuous range-monitoring queries, and we then extend this system using the GPU. Additionally, with mobile communication devices becoming a commodity, location-based services will become ubiquitous. To cope with the very high intensity of location-based queries, we propose a view-oriented approach to the location database, thereby reducing computation costs by exploiting computation sharing among queries requiring the same view. Our studies show that by exploiting the parallel processing power of the GPU, we are able to significantly scale the number of mobile objects while maintaining an acceptable level of performance. Our second approach was to view this research problem as one belonging to the domain of data streams. Several works have convincingly argued that the two research fields of spatio-temporal data streams and the management of moving objects can naturally come together [IlMI10, ChFr03, MoXA04]. For example, the output of a GPS receiver monitoring the position of a mobile object is viewed as a data stream of location updates. This data stream of location updates, along with those from the plausibly many other mobile objects, is received at a centralized server, which processes the streams upon arrival, effectively updating the answers to the currently active queries in real time. For this second approach, we present GEDS, a scalable, Graphics Processing Unit (GPU)-based framework for the evaluation of continuous spatio-temporal queries over spatio-temporal data streams. Specifically, GEDS employs the computation-sharing and parallel-processing paradigms to deliver scalability in the evaluation of continuous, spatio-temporal range queries and continuous, spatio-temporal kNN queries. The GEDS framework utilizes the parallel processing capability of the GPU, a stream processor by trade, to handle the computation required in this application. Experimental evaluation shows promising performance and demonstrates the scalability and efficacy of GEDS in spatio-temporal data streaming environments. Additional performance studies demonstrate that, even in light of the costs associated with memory transfers, the parallel processing power provided by GEDS clearly counters and outweighs any associated costs. Finally, in an effort to move beyond the analysis of specific algorithms over the GEDS framework, we take a broader approach in our analysis of GPU computing. What algorithms are appropriate for the GPU? What types of applications can benefit from the parallel and stream processing power of the GPU? And can we identify a class of algorithms that are best suited for GPU computing? To answer these questions, we develop an abstract performance model detailing the relationship between the CPU and the GPU. From this model, we are able to extrapolate a list of attributes common to successful GPU-based applications, thereby providing insight into which algorithms and applications are best suited for the GPU, and also providing an estimated theoretical speedup for said GPU-based applications.
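
A hedged illustration of the computation-sharing, data-parallel idea behind such range-query evaluation: checking many continuous range queries against many moving-object positions in one vectorized step. This is generic NumPy, not GEDS code; the same broadcasting arithmetic is what a GPU implementation parallelizes, and the sizes below are arbitrary.

```python
# Hedged sketch: vectorized evaluation of many circular range queries over many
# moving objects. Sizes and the uniform random layout are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
n_objects, n_queries = 20_000, 500

obj_xy = rng.uniform(0, 1000, size=(n_objects, 2))    # current object positions
q_center = rng.uniform(0, 1000, size=(n_queries, 2))  # query centers
q_radius = rng.uniform(5, 50, size=n_queries)         # query radii

def evaluate_range_queries(obj_xy, q_center, q_radius):
    """Boolean matrix: result[i, j] is True if object j lies inside query i."""
    # (n_queries, n_objects) pairwise squared distances via broadcasting.
    d2 = ((q_center[:, None, :] - obj_xy[None, :, :]) ** 2).sum(axis=-1)
    return d2 <= q_radius[:, None] ** 2

matches = evaluate_range_queries(obj_xy, q_center, q_radius)
print("average objects per query:", matches.sum(axis=1).mean())
```

In a streaming setting, each batch of location updates would overwrite rows of `obj_xy` and the query answers would be refreshed by re-running the shared evaluation.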
115

Approches variationnelles statistiques spatio-temporelles pour l'analyse quantitative de la perfusion myocardique en IRM / Spatio-temporal statistical variational models for the quantitative assessment of myocardial perfusion in magnetic resonance imaging

Hamrouni-Chtourou, Sameh 11 July 2012 (has links)
L'analyse quantitative de la perfusion myocardique, i.e. l'estimation d'indices de perfusion segmentaires puis leur confrontation à des valeurs normatives, constitue un enjeu majeur pour le dépistage, le traitement et le suivi des cardiomyopathies ischémiques --parmi les premières causes de mortalité dans les pays occidentaux. Dans la dernière décennie, l'imagerie par résonance magnétique de perfusion (IRM-p) est la modalité privilégiée pour l'exploration dynamique non-invasive de la perfusion cardiaque. L'IRM-p consiste à acquérir des séries temporelles d'images cardiaques en incidence petit-axe et à plusieurs niveaux de coupe le long du grand axe du cœur durant le transit d'un agent de contraste vasculaire dans les cavités et le muscle cardiaques. Les examens IRM-p résultants présentent de fortes variations non linéaires de contraste et des artefacts de mouvements cardio-respiratoires. Dans ces conditions, l'analyse quantitative de la perfusion myocardique est confrontée aux problèmes complexes de recalage et de segmentation de structures cardiaques non rigides dans des examens IRM-p. Cette thèse se propose d'automatiser l’analyse quantitative de la perfusion du myocarde en développant un outil d'aide au diagnostic non supervisé dédié à l'IRM de perfusion cardiaque de premier passage, comprenant quatre étapes de traitement : -1.sélection automatique d'une région d'intérêt centrée sur le cœur; -2.compensation non rigide des mouvements cardio-respiratoires sur l'intégralité de l'examen traité; -3.segmentation des contours cardiaques; -4.quantification de la perfusion myocardique. Les réponses que nous apportons aux différents défis identifiés dans chaque étape s'articulent autour d'une idée commune : exploiter l'information liée à la cinématique de transit de l'agent de contraste dans les tissus pour discriminer les structures anatomiques et guider le processus de recalage des données. Ce dernier constitue le travail central de cette thèse. Les méthodes de recalage non rigide d'images fondées sur l'optimisation de mesures d'information constituent une référence en imagerie médicale. Leur cadre d'application usuel est l'alignement de paires d'images par appariement statistique de distributions de luminance, manipulées via leurs densités de probabilité marginales et conjointes, estimées par des méthodes à noyaux. Efficaces pour des densités jointes présentant des classes individualisées ou réductibles à des mélanges simples, ces approches atteignent leurs limites pour des mélanges non-linéaires où la luminance au pixel s’avère être un attribut trop frustre pour permettre une décision statistique discriminante, et pour des données mono-modal avec variations non linéaires et multi-modal. Cette thèse introduit un modèle mathématique de recalage informationnel multi-attributs/multi-vues générique répondant aux défis identifiés: (i) alignement simultané de l'intégralité de l'examen IRM-p analysé par usage d'un atlas, naturel ou synthétique, dans lequel le cœur est immobile et en utilisant les courbes de rehaussement au pixel comme ensemble dense de primitives; et (ii) capacité à intégrer des primitives image composites, spatiales ou spatio-temporelles, de grande dimension. Ce modèle, disponible dans le cadre classique de Shannon et dans le cadre généralisé d'Ali-Silvey, est fondé sur de nouveaux estimateurs géométriques de type k plus proches voisins des mesures d'information, consistants en dimension arbitraire. 
Nous étudions leur optimisation variationnelle en dérivant des expressions analytiques de leurs gradients sur des espaces de transformations spatiales régulières de dimension finie et infinie, et en proposant des schémas numériques et algorithmiques de descente en gradient efficace. Ce modèle de portée générale est ensuite instancié au cadre médical ciblé, et ses performances, notamment en terme de précision et de robustesse, sont évaluées dans le cadre d'un protocole expérimental tant qualitatif que quantitatif / Quantitative assessment of myocardial perfusion, i.e. computation of perfusion parameters which are then compared against normative values, is a key issue for the diagnosis, therapy planning and monitoring of ischemic cardiomyopathies --the leading cause of death in Western countries. Within the last decade, perfusion magnetic resonance imaging (p-MRI) has emerged as a reference modality for reliably assessing myocardial perfusion in a noninvasive and accurate way. In p-MRI acquisitions, short-axis image sequences are captured at multiple slice levels along the long-axis of the heart during the transit of a vascular contrast agent through the cardiac chambers and muscle. Resulting p-MRI exams exhibit high nonlinear contrast variations and complex cardio-thoracic motions. Perfusion assessment is then faced with the complex problems of non rigid registration and segmentation of cardiac structures in p-MRI exams. The objective of this thesis is to enable an automated quantitative computer-aided diagnosis tool for first pass cardiac perfusion MRI, comprising four processing steps: 1. automated cardiac region of interest extraction; 2. non rigid registration of cardio-thoracic motions throughout the whole sequence; 3. cardiac boundaries segmentation; 4. quantification of myocardial perfusion. The answers we give to the various challenges identified in each step are based on a common idea: investigating information related to the kinematics of contrast agent transit in the tissues for discriminating the anatomical structures and driving the alignment process. The latter is the main work of this thesis. Non rigid image registration methods based on the optimization of information measures provide versatile solutions for robustly aligning medical data. Their usual application setting is the alignment of image pairs by statistically matching luminance distributions, handled using marginal and joint probability densities estimated via kernel techniques. Though efficient for joint densities exhibiting well-separated clusters or reducible to simple mixtures, these approaches reach their limits for nonlinear mixtures where pixelwise luminance appears to be a too coarse feature for allowing unambiguous statistical decisions, for mono-modal data with nonlinear variations, and for multi-modal data. This thesis presents a unified mathematical model for information-theoretic multi-feature/multi-view non rigid registration, addressing the identified challenges: (i) simultaneous registration of the whole p-MRI exam, using a natural or synthetic atlas generated as a motion-free exam depicting the transit of the vascular contrast agent through cardiac structures, and using local contrast enhancement curves as a dense feature set; (ii) the ability to readily generalize to richer feature spaces combining radiometric and geometric information. The resulting model is based on novel consistent k-nearest neighbors estimators of information measures in high dimension, for both the classical Shannon and the generalized Ali-Silvey frameworks. We study their variational optimization by deriving in closed form their gradient flows over finite and infinite dimensional smooth transform spaces, and by proposing computationally efficient gradient descent schemes. The resulting generic theoretical framework is applied to the groupwise alignment of cardiac p-MRI exams, and its performances, in terms of accuracy and robustness, are evaluated in an experimental qualitative and quantitative protocol.
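
A hedged, generic sketch of a k-nearest-neighbor (Kozachenko-Leonenko style) entropy estimator, the kind of consistent information-measure estimator in arbitrary dimension that the abstract refers to; this is an illustration, not the thesis implementation.

```python
# Hedged sketch: kNN (Kozachenko-Leonenko) estimator of differential entropy.
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma, gammaln

def knn_entropy(samples, k=5):
    """Differential entropy estimate (nats) from an (N, d) array of samples."""
    samples = np.asarray(samples, dtype=float)
    n, d = samples.shape
    # Distance to the k-th neighbor; the query returns each point itself first.
    dist, _ = cKDTree(samples).query(samples, k=k + 1)
    eps = dist[:, -1]
    # Log-volume of the d-dimensional unit ball.
    log_unit_ball = (d / 2.0) * np.log(np.pi) - gammaln(d / 2.0 + 1.0)
    return digamma(n) - digamma(k) + log_unit_ball + d * np.mean(np.log(eps))

# Sanity check against the closed form for a standard Gaussian in 3 dimensions.
rng = np.random.default_rng(0)
x = rng.standard_normal((5000, 3))
print(knn_entropy(x), 1.5 * np.log(2 * np.pi * np.e))  # estimate vs. exact value
```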
116

Textual data mining applications for industrial knowledge management solutions

Ur-Rahman, Nadeem January 2010 (has links)
In recent years, knowledge has become an important resource for enhancing business, and many activities are required to manage these knowledge resources well and help companies remain competitive within industrial environments. The data available in most industrial setups is complex in nature, and multiple data formats may be generated to track the progress of different projects, whether related to developing new products or to providing better services to customers. Knowledge discovery from different databases requires considerable effort, and data mining techniques serve this purpose by handling structured data formats. If, however, the data is semi-structured or unstructured, the combined efforts of data and text mining technologies may be needed to produce fruitful results. This thesis focuses on issues related to the discovery of knowledge from semi-structured or unstructured data formats through the application of textual data mining techniques to automate the classification of textual information into two categories or classes, which can then be used to help manage the knowledge available in multiple data formats. Applications of different data mining techniques to discover valuable information and knowledge from the manufacturing and construction industries are explored as part of a literature review. The application of text mining techniques to handle semi-structured or unstructured data is discussed in detail. A novel integration of different data and text mining tools is proposed in the form of a framework in which knowledge discovery and its refinement are performed through the application of clustering and Apriori association rule mining algorithms. Finally, the hypothesis of acquiring better classification accuracies is detailed through the application of the methodology to case study data available in the form of Post Project Review (PPR) reports. The process of discovering useful knowledge, its interpretation, and its utilisation has been automated to classify the textual data into two classes.
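
A hedged sketch of the Apriori-style step in the proposed framework: mining frequent term sets from tokenized report text, which can then feed association rules or help refine the two document classes. The toy "documents" below stand in for preprocessed Post Project Review reports.

```python
# Hedged sketch: Apriori-style frequent itemset mining over tokenized reports.
# The documents, terms, and min_support value are illustrative only.
from itertools import combinations

documents = [                       # toy stand-ins for tokenized PPR reports
    {"delay", "supplier", "cost"},
    {"delay", "cost", "design"},
    {"supplier", "cost"},
    {"delay", "supplier", "cost", "rework"},
]

def frequent_itemsets(transactions, min_support=0.5):
    """Return frequent itemsets (frozensets) mapped to their support."""
    n = len(transactions)
    support = lambda items: sum(items <= t for t in transactions) / n
    # Start from single items, then grow candidates level by level.
    current = [frozenset([item]) for item in set().union(*transactions)]
    frequent, k = {}, 1
    while current:
        level = {c: s for c in current if (s := support(c)) >= min_support}
        frequent.update(level)
        items = sorted(set().union(*level)) if level else []
        k += 1
        current = [frozenset(c) for c in combinations(items, k)]
    return frequent

for itemset, sup in sorted(frequent_itemsets(documents).items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), round(sup, 2))
```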
117

An IoT Solution for Urban Noise Identification in Smart Cities : Noise Measurement and Classification

Alsouda, Yasser January 2019 (has links)
Noise is defined as any undesired sound. Urban noise and its effect on citizens are a significant environmental problem, and the increasing level of noise has become a critical problem in some cities. Fortunately, noise pollution can be mitigated by better planning of urban areas or controlled by administrative regulations. However, the execution of such actions requires well-established systems for noise monitoring. In this thesis, we present a solution for noise measurement and classification using a low-power and inexpensive IoT unit. To measure the noise level, we implement an algorithm for calculating the sound pressure level in dB, achieving a measurement error of less than 1 dB. Our machine learning-based method for noise classification uses Mel-frequency cepstral coefficients for audio feature extraction and four supervised classification algorithms (support vector machine, k-nearest neighbors, bootstrap aggregating, and random forest). We evaluate our approach experimentally with a dataset of about 3000 sound samples grouped into eight sound classes (such as car horn, jackhammer, or street music). We explore the parameter space of the four algorithms to estimate the optimal parameter values for the classification of sound samples in the dataset under study. We achieve noise classification accuracy in the range of 88%-94%.
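
A hedged sketch of the two components described above: a sound pressure level estimate in dB from raw samples, and MFCC features fed to an SVM classifier (assuming the librosa and scikit-learn libraries). The calibration offset, file names, and labels are illustrative assumptions.

```python
# Hedged sketch: SPL measurement in dB plus MFCC + SVM noise classification.
# Calibration offset, file names, and labels are illustrative assumptions.
import numpy as np
import librosa
from sklearn.svm import SVC

def sound_pressure_level(samples, calibration_offset=94.0):
    """Approximate SPL in dB: 20*log10(RMS) plus a microphone calibration offset."""
    rms = np.sqrt(np.mean(samples ** 2))
    return 20.0 * np.log10(rms + 1e-12) + calibration_offset

def mfcc_features(path, n_mfcc=13):
    """Mean and standard deviation of MFCCs over a clip as a compact feature vector."""
    y, sr = librosa.load(path, sr=22050)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# A labeled urban-sound dataset would supply these lists (hypothetical here).
files = ["car_horn_01.wav", "car_horn_02.wav", "jackhammer_01.wav", "jackhammer_02.wav"]
labels = ["car_horn", "car_horn", "jackhammer", "jackhammer"]

X = np.array([mfcc_features(f) for f in files])
clf = SVC(kernel="rbf", C=10).fit(X, labels)
print(clf.predict(X[:1]))
```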
118

Suivi d'objets d'intérêt dans une séquence d'images : des points saillants aux mesures statistiques / Tracking objects of interest in an image sequence: from salient points to statistical measures

Garcia, Vincent 11 December 2008 (has links) (PDF)
The problem of tracking objects in a video arises in domains such as computer vision (video surveillance, for example) and television and film post-production (special effects). It comes in two main variants: tracking a region of interest, which denotes a coarse tracking of the object, and spatio-temporal segmentation, which corresponds to a precise tracking of the contours of the object of interest. In both cases, the region or object of interest must have been outlined beforehand on the first, and possibly the last, frame of the video sequence. In this thesis we propose a method for each of these two types of tracking, as well as a fast implementation, exploiting the Graphics Processing Unit (GPU), of a region-of-interest tracking method developed elsewhere. The first method relies on the analysis of the temporal trajectories of salient points and performs region-of-interest tracking. Salient points (typically locations of strong curvature of the iso-intensity lines) are detected in every frame of the sequence. Trajectories are built by linking points in successive frames whose neighborhoods are consistent. Our contribution lies first in the analysis of the trajectories over a group of frames, which improves the quality of the motion estimation. In addition, we use a spatio-temporal weighting for each trajectory, which adds a temporal constraint on the motion while taking into account the local geometric deformations of the object that a global motion model ignores. The second method performs spatio-temporal segmentation. It relies on estimating the motion of the object's contour using the information contained in a ring that extends on both sides of this contour. This ring captures the contrast between the background and the object in a local context, which is our first contribution here. Furthermore, matching a portion of the ring to a region of the next frame in the sequence with a statistical similarity measure, namely the entropy of the residual, improves the tracking while simplifying the choice of the optimal ring size. Finally, we propose a fast implementation of an existing region-of-interest tracking method. This method relies on a statistical similarity measure, the Kullback-Leibler divergence, which can be estimated in a high-dimensional space through many k-th nearest neighbor distance computations in that space. Since these computations are very costly, we propose a parallel GPU implementation (using NVIDIA's CUDA programming interface) of the exhaustive k-nearest-neighbor search. We show that this implementation speeds up object tracking by up to a factor of 15 compared with an implementation of this search that requires a prior structuring of the data.
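
A hedged sketch of the exhaustive k-nearest-neighbor search mentioned above, written with NumPy broadcasting; the thesis implements the same brute-force idea in CUDA on the GPU. Sizes, dimension, and k are illustrative.

```python
# Hedged sketch: brute-force (exhaustive) k-nearest-neighbor search, the CPU
# analogue of the CUDA implementation described above. Sizes are illustrative.
import numpy as np

rng = np.random.default_rng(0)
reference = rng.standard_normal((4096, 64))   # reference points, high-dimensional
queries = rng.standard_normal((512, 64))      # query points

def knn_bruteforce(queries, reference, k=10):
    """Indices and distances of the k nearest reference points for each query."""
    # Pairwise squared distances via ||q||^2 + ||r||^2 - 2 q.r (no Python loops).
    d2 = ((queries ** 2).sum(1)[:, None]
          + (reference ** 2).sum(1)[None, :]
          - 2.0 * queries @ reference.T)
    idx = np.argpartition(d2, k, axis=1)[:, :k]                 # k smallest, unordered
    order = np.argsort(np.take_along_axis(d2, idx, axis=1), 1)  # order those k
    idx = np.take_along_axis(idx, order, axis=1)
    dist = np.sqrt(np.maximum(np.take_along_axis(d2, idx, axis=1), 0.0))
    return idx, dist

idx, dist = knn_bruteforce(queries, reference, k=10)
print(idx.shape, dist[:1, :3])
```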
119

On spectrum sensing, resource allocation, and medium access control in cognitive radio networks

Karaputugala Gamacharige, Madushan Thilina 12 1900 (has links)
Cognitive radio-based wireless networks have been proposed as a promising technology to improve the utilization of the radio spectrum through opportunistic spectrum access. In this context, cognitive radios opportunistically access spectrum that is licensed to primary users when the primary user transmission is detected to be absent. For opportunistic spectrum access, the cognitive radios should sense the radio environment and allocate the spectrum and power based on the sensing results. To this end, in this thesis, I first develop a novel cooperative spectrum sensing scheme for cognitive radio networks (CRNs) based on machine learning techniques used for pattern classification; both unsupervised and supervised learning-based classification techniques are implemented for cooperative spectrum sensing. Secondly, I propose a novel joint channel and power allocation scheme for downlink transmission in cellular CRNs. I formulate the downlink resource allocation problem as a generalized spectral-footprint minimization problem. The channel assignment problem for secondary users is solved by applying a modified Hungarian algorithm, while the power allocation subproblem is solved using a Lagrangian technique. Specifically, I propose a low-complexity modified Hungarian algorithm for subchannel allocation which exploits the local information in the cost matrix. Finally, I propose a novel dynamic common control channel-based medium access control (MAC) protocol for CRNs. Specifically, unlike traditional dedicated control channel-based MAC protocols, the proposed MAC protocol eliminates the requirement of a dedicated channel for control information exchange. / October 2015
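
A hedged sketch of casting the secondary-user channel assignment as an assignment problem solved with the Hungarian algorithm, here via SciPy's implementation standing in for the modified low-complexity variant proposed in the thesis; the cost matrix is random and purely illustrative.

```python
# Hedged sketch: channel assignment for secondary users as an assignment problem.
# The random cost matrix stands in for a spectral-footprint/interference cost.
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(1)
n_users, n_channels = 4, 6

# cost[i, j]: cost of assigning subchannel j to secondary user i (illustrative).
cost = rng.uniform(0.1, 1.0, size=(n_users, n_channels))

users, channels = linear_sum_assignment(cost)   # minimizes the total cost
for u, c in zip(users, channels):
    print(f"secondary user {u} -> channel {c} (cost {cost[u, c]:.2f})")
print("total cost:", round(cost[users, channels].sum(), 2))
```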
120

Dolování z dat v jazyce Python / Data Mining with Python

Šenovský, Jakub January 2017 (has links)
The main goal of this thesis was to get acquainted with the phases of data mining, with the support for data mining in the programming languages Python and R, and to demonstrate their use in two case studies. A comparison of these languages in the field of data mining is also included. The data preprocessing phase and the mining algorithms for classification, prediction and clustering are described, and the most significant data mining libraries for Python and R are presented. In the first case study, work with time series is demonstrated using the ARIMA model and neural networks, with accuracy verified using the mean squared error. In the second case study, the results of football matches are classified using k-nearest neighbors, a Bayes classifier, random forest and logistic regression. The classification performance is reported using the accuracy score and a confusion matrix. The work concludes with an evaluation of the achieved results and suggestions for future improvement of the individual models.
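
A hedged sketch of the time-series part of the first case study: fitting an ARIMA model with statsmodels and scoring the forecast with the mean squared error. The synthetic series and the (2, 1, 1) order are illustrative assumptions, not the thesis's data or configuration.

```python
# Hedged sketch: ARIMA forecast on a synthetic series, scored with MSE.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
series = np.cumsum(rng.normal(0.5, 1.0, size=200))   # synthetic trending series

train, test = series[:180], series[180:]
model = ARIMA(train, order=(2, 1, 1)).fit()
forecast = model.forecast(steps=len(test))

print("MSE:", mean_squared_error(test, forecast))
```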
