  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
921

Machine Learning for Inspired, Structured, Lyrical Music Composition

Bodily, Paul Mark 01 July 2018 (has links)
Computational creativity has been called the "final frontier" of artificial intelligence due to the difficulty inherent in defining and implementing creativity in computational systems. Despite this difficulty, computational creativity is becoming a more significant part of our everyday lives, particularly in music, as seen in the prevalence of music recommendation systems, co-creational music software packages, smart playlists, and procedurally generated video games. Significant progress can be seen in industrial applications such as Spotify, Pandora, and Apple Music, but several problems persist. Of more general interest, however, is the question of whether computers can exhibit autonomous creativity in music composition. One of the primary challenges in this endeavor is enabling computational systems to create music that exhibits global structure, can learn structure from data, and can effectively incorporate autonomy and intention. We seek to address these challenges in the context of a modular machine learning framework called hierarchical Bayesian program learning (HBPL). Breaking the problem of music composition into smaller pieces, we focus primarily on developing machine learning models that solve the problems related to structure. In particular, we present an adaptation of non-homogeneous Markov models that enables binary constraints, and we present a structural learning model, the multiple Smith-Waterman (mSW) alignment method, which extends sequence alignment techniques from bioinformatics. To address the issue of intention, we incorporate our work on structured sequence generation into a full-fledged computational creative system called Pop*, which we show through various evaluative means to possess, to varying extents, the characteristics of creativity and also creativity itself.
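
To make the constrained-Markov idea in this abstract concrete, here is a minimal Python sketch of sampling from a non-homogeneous Markov model under binary, per-position constraints. The vocabulary, transition table, and end-of-sequence constraint are invented for illustration and are not drawn from the thesis; a production NHMM would propagate constraints backward through the chain so that sampling never dead-ends, rather than masking greedily at each step as this sketch does.

    import random

    # Toy transition table over a nonsense-syllable vocabulary (illustrative
    # only, not data or code from the thesis).
    transitions = {
        "la": {"la": 0.5, "dee": 0.3, "dum": 0.2},
        "dee": {"la": 0.4, "dum": 0.6},
        "dum": {"la": 0.5, "dee": 0.3, "dum": 0.2},
    }

    # Binary constraints: one predicate per position that a token must satisfy.
    # Here a hypothetical constraint forces the final token to be "dum".
    def make_constraints(length):
        return [lambda tok: True] * (length - 1) + [lambda tok: tok == "dum"]

    def sample_constrained(start, length, constraints):
        # Non-homogeneous sampling: at each position, mask out tokens that
        # violate that position's constraint and renormalize before sampling.
        seq = [start]
        for pos in range(1, length):
            allowed = {t: p for t, p in transitions[seq[-1]].items()
                       if constraints[pos](t)}
            if not allowed:
                raise ValueError("no admissible token at position %d" % pos)
            r, acc = random.random() * sum(allowed.values()), 0.0
            for tok, p in allowed.items():
                acc += p
                if r <= acc:
                    seq.append(tok)
                    break
        return seq

    print(sample_constrained("la", 6, make_constraints(6)))
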
922

Failure Prediction using Machine Learning in a Virtualized HPC System and application

Mohammed, Bashir, Awan, Irfan U., Ugail, Hassan, Muhammad, Y. January 2019 (has links)
Failure is an increasingly important issue in high performance computing and cloud systems. As large-scale systems continue to grow in scale and complexity, mitigating the impact of failure and providing accurate predictions with sufficient lead time remains a challenging research problem. Traditional fault-tolerance strategies such as regular checkpointing and replication are not adequate given the emerging complexities of high performance computing systems, which makes an effective, proactive failure management approach, aimed at minimizing the effect of failure within the system, all the more important. Machine learning techniques, with their ability to learn from past information to predict future patterns of behaviour, make it possible to predict potential system failures more accurately. Thus, in this paper, we explore the predictive abilities of machine learning by applying a number of algorithms to improve the accuracy of failure prediction. We have developed a failure prediction model using time series and machine learning, and performed comparative tests of prediction accuracy. The primary algorithms we considered are the Support Vector Machine (SVM), Random Forest (RF), k-Nearest Neighbors (KNN), Classification and Regression Trees (CART), and Linear Discriminant Analysis (LDA). Experimental results show that our model achieves an average prediction accuracy of 90% with SVM, outperforming the other algorithms. This finding suggests that our method can effectively predict future system and application failures within the system.
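
The five-algorithm comparison this abstract describes can be approximated in a few lines of scikit-learn. The sketch below runs the named algorithms under 10-fold cross-validation on synthetic data standing in for the paper's windowed system metrics; the dataset, its 90/10 class split, and all hyperparameters are illustrative assumptions, not the paper's setup.

    from sklearn.datasets import make_classification
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.svm import SVC
    from sklearn.tree import DecisionTreeClassifier

    # Synthetic stand-in for windowed system metrics labeled failure/no-failure.
    X, y = make_classification(n_samples=2000, n_features=20,
                               weights=[0.9, 0.1], random_state=0)

    models = {
        "SVM": SVC(gamma="scale"),
        "RF": RandomForestClassifier(n_estimators=100, random_state=0),
        "KNN": KNeighborsClassifier(),
        "CART": DecisionTreeClassifier(random_state=0),
        "LDA": LinearDiscriminantAnalysis(),
    }

    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=10)   # 10-fold accuracy
        print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
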
923

A Hierarchical Multi-Output Nearest Neighbor Model for Multi-Output Dependence Learning

Morris, Richard Glenn 08 March 2013 (has links)
Multi-Output Dependence (MOD) learning is a generalization of standard classification problems that allows for multiple outputs that are dependent on each other. A primary issue that arises in the context of MOD learning is that for any given input pattern there can be multiple correct output patterns. This changes the learning task from function approximation to relation approximation. Previous algorithms do not consider this problem, and thus cannot be readily applied to MOD problems. To perform MOD learning, we introduce the Hierarchical Multi-Output Nearest Neighbor model (HMONN) that employs a basic learning model for each output and a modified nearest neighbor approach to refine the initial results. This paper focuses on tasks with nominal features, although HMONN has the initial capacity for solving MOD problems with real-valued features. Results obtained using UCI repository, synthetic, and business application data sets show improved accuracy over a baseline that treats each output as independent of all the others, with HMONN showing improvement that is statistically significant in the majority of cases.
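
A simplified reading of the two-stage idea in this abstract, sketched below with invented two-output toy data: train an independent base learner per output, then refine each output with a nearest-neighbor model over the inputs augmented with the other outputs' initial predictions, so inter-output dependence can correct first-stage mistakes. HMONN's actual hierarchy and nominal-feature handling are more involved than this sketch.

    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)
    X = rng.integers(0, 3, size=(300, 5)).astype(float)  # integer-coded nominal features
    # Two outputs that depend on each other as well as on the inputs.
    y0 = (X[:, 0] == X[:, 1]).astype(int)
    y1 = (y0 ^ (X[:, 2] > 1)).astype(int)
    Y = np.stack([y0, y1], axis=1)

    # Stage 1: an independent base learner per output (the baseline HMONN refines).
    base = [DecisionTreeClassifier(random_state=0).fit(X, Y[:, j]) for j in range(2)]
    Y_hat = np.stack([m.predict(X) for m in base], axis=1)

    # Stage 2: refine each output with a nearest-neighbor model over the inputs
    # augmented with the other output's stage-1 prediction, letting dependence
    # between outputs correct stage-1 mistakes.
    refined = [KNeighborsClassifier(n_neighbors=5)
               .fit(np.hstack([X, np.delete(Y_hat, j, axis=1)]), Y[:, j])
               for j in range(2)]
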
924

A progressive predictive system based on probabilistic classification as a guide for learning

Villagrá-Arnedo, Carlos-José 22 January 2016 (has links)
The work in this thesis develops a progressive prediction model that improves the teaching-learning process through information technologies and, in particular, artificial intelligence techniques. The model is built on a gamified interactive system that manages the lab assignments of the course Matemáticas I, in which students learn logical reasoning through a video game called PLMan, very similar to Pac-Man. Over the term, students access this system and progress, accumulating their lab grade by solving PLMan game maps. Data from the students' interaction with the gamified system are recorded in a database, from which representative features of each student's state, consisting of system-usage data and learning outcomes, are extracted. The model uses the SVM machine learning technique and produces a weekly classification of students as the probability of belonging to each of three possible classes (high, normal, and low performance), accumulating the data collected up to the current week. Experiments with data from the 2014/15 academic year, corresponding to 336 students, yielded good results in terms of the accuracy of the proposed SVM algorithm. An exhaustive analysis of the correlation between the features used and the final grade was then carried out, extracting those with the strongest linear relationship to it. A new experiment using only these selected features obtained results similar to, though slightly below, those of the initial experiment, which suggests that there may be non-linear relationships between the variables that the SVM technique can detect. Finally, the model presents its results in a way that provides valuable information for teachers and students, in the form of easy-to-interpret graphs, making it a mechanism for detecting students at risk of failing and, in any case, for guiding them toward their best performance. In short, it is a model for predicting student performance with two main contributions: progressive classification into three classes with probability values, and visual information in the form of graphs, which serve as a guidance mechanism for improving the teaching-learning process.
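
The weekly probabilistic classification the thesis describes maps naturally onto a probability-calibrated SVM. The sketch below, with invented feature and label data, retrains on all data accumulated up to the current week and returns a per-student probability for each of the three performance classes; it illustrates the general scheme, not the thesis's code.

    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    # Probability-calibrated SVM over three performance classes.
    clf = make_pipeline(StandardScaler(), SVC(probability=True, gamma="scale"))

    def weekly_prediction(X_so_far, y_so_far, X_current):
        # Retrain on everything accumulated up to the current week, then return
        # one row per student: P(low), P(normal), P(high) (order = clf.classes_).
        clf.fit(X_so_far, y_so_far)
        return clf.predict_proba(X_current)

    rng = np.random.default_rng(1)
    X = rng.normal(size=(120, 6))      # toy usage/score features up to week w
    y = rng.integers(0, 3, size=120)   # toy labels: 0=low, 1=normal, 2=high
    print(weekly_prediction(X, y, X[:5]).round(2))
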
925

Clicking using the eyes, a machine learning approach.

Stenström, Albin January 2015 (has links)
This master's thesis report describes the work of evaluating the approach of using an eye-tracker and machine learning to generate an interaction model for clicks. In the study, recordings were made of 10 participants using a quiz application, and machine learning was then applied. The resulting models varied in quality from a machine learning standpoint, and most did not work well for interaction. One model enabled correct interaction 80% of the time, although the specific circumstances for its success were not identified. The conclusion of the thesis is that the approach works in some cases, but that more research is needed to evaluate its general suitability and to find approaches that make it work reliably.
926

Machine learning approaches for predicting genotype from phenotype and a novel clustering technique for subgenotype discovery: an application to inherited deafness

Taylor, Kyle Ross 01 July 2014 (has links)
This thesis describes a method, software tool, and web-based service called AudioGene, which can be used to predict genotype from phenotype in patients with inherited forms of hearing loss. To enhance the effectiveness of this prediction, a novel clustering technique called Hierarchical Surface Clustering (HSC) was developed, which allows existing phenotype data to drive the discovery of new disease subtypes and their genotypes. For Autosomal Dominant Non-syndromic Hearing Loss (ADNSHL), the accuracy of AudioGene in predicting the top three candidate loci was 68% using a multi-instance support vector machine, compared to 44% for a Majority classifier. The method was extended to predict the mutation type for patients with mutations in the Autosomal Recessive Non-syndromic Hearing Loss locus DFNB1, achieving an accuracy of 83% compared to 50% for a Majority classifier. Along with HSC, a novel visualization technique was developed to plot the progression of hearing loss with age as 3D surfaces. Simulated datasets were used along with actual clinical data to evaluate the performance of HSC and compare it to other clustering techniques. On the clinical data, HSC had the highest Adjusted Rand Index, with a value of 0.459 compared to 0.187 for spectral clustering and 0.103 for k-means clustering.
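
The Adjusted Rand Index comparison reported in this abstract is a standard clustering evaluation. The sketch below shows how such ARI scores are computed against known labels, using synthetic blobs in place of the audiogram-derived clinical features; HSC itself is not reproduced, only the two baselines it was compared against.

    from sklearn.cluster import KMeans, SpectralClustering
    from sklearn.datasets import make_blobs
    from sklearn.metrics import adjusted_rand_score

    # Synthetic stand-in for audiogram-derived features with three known subtypes.
    X, true_labels = make_blobs(n_samples=200, centers=3, random_state=0)

    for name, algo in [("k-means", KMeans(n_clusters=3, n_init=10, random_state=0)),
                       ("spectral", SpectralClustering(n_clusters=3, random_state=0))]:
        pred = algo.fit_predict(X)
        # ARI is 1 for a perfect match with the true partition, near 0 for chance.
        print(name, round(adjusted_rand_score(true_labels, pred), 3))
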
927

Learning better physics: a machine learning approach to lattice gauge theory

Foreman, Samuel Alfred 01 August 2018 (has links)
In this work we explore how lattice gauge theory stands to benefit from new developments in machine learning, and look at two specific examples that illustrate this point. We begin with a brief overview of selected topics in machine learning for those who may be unfamiliar, and provide a simple example that helps to show how these ideas are carried out in practice. After providing the relevant background information, we then introduce an example of renormalization group (RG) transformations, inspired by the tensor RG, that can be used for arbitrary image sets, and look at applying this idea to equilibrium configurations of the two-dimensional Ising model. The second main idea presented in this thesis involves using machine learning to improve the efficiency of Markov Chain Monte Carlo (MCMC) methods. Explicitly, we describe a new technique for performing Hamiltonian Monte Carlo (HMC) simulations using an alternative leapfrog integrator that is parameterized by weights in a neural network. This work is based on the L2HMC ('Learning to Hamiltonian Monte Carlo') algorithm introduced in [1].
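
For readers unfamiliar with HMC, the fixed leapfrog integrator that L2HMC generalizes looks like the sketch below; this is a standard textbook formulation, not code from the thesis. L2HMC replaces the fixed position/momentum updates with updates whose scalings and offsets are produced by neural networks.

    import numpy as np

    def leapfrog(q, p, grad_U, eps, steps):
        # Standard fixed leapfrog integrator for HMC. L2HMC replaces these fixed
        # updates with ones whose scalings and offsets come from neural networks.
        p = p - 0.5 * eps * grad_U(q)        # initial half-step in momentum
        for _ in range(steps - 1):
            q = q + eps * p                  # full step in position
            p = p - eps * grad_U(q)          # full step in momentum
        q = q + eps * p
        p = p - 0.5 * eps * grad_U(q)        # final half-step in momentum
        return q, p

    # Toy target: standard Gaussian, U(q) = q**2 / 2, so grad_U(q) = q.
    q, p = np.array([1.0]), np.array([0.5])
    q_new, p_new = leapfrog(q, p, lambda q: q, eps=0.1, steps=10)
    print(q_new, p_new)
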
928

Computational methods for identification of disease-associated variations in exome sequencing

Wagner, Alex Handler 01 December 2014 (has links)
The explosive growth in the ability to sequence DNA due to next-generation sequencing (NGS) technologies has made it possible to characterize an individual's exome inexpensively. This ability provides clinicians with additional tools to evaluate the likely genetic factors underlying heritable diseases. With this added capacity comes a need to identify relationships between the genetic variations observed in a patient and the disease with which the patient presents. This dissertation focuses on computational techniques to inform molecular diagnostics from NGS data, addressing three distinct domains in the characterization of disease-associated variants from exome sequencing. First, strategies for producing complete and non-artifactual candidate variant lists are discussed. The process of converting patient DNA to a list of variants from the reference genome is very complex, and numerous modes of error may be introduced along the way. For this, a Random Forest classifier was built to capture biases in a laboratory variant calling pipeline, and a C4.5 decision tree was built to enable discovery of thresholds for false positive reduction. Additionally, a strategy for augmenting exome capture experiments through evaluation of RNA-sequencing is discussed. Second, a novel positive and unlabeled learning for prioritization (PULP) strategy is proposed to identify the candidate variants most likely to be associated with a patient's disease. Using a number of publicly available data sources, PULP ranks genes according to how alike they are to previously discovered disease genes. This strategy is evaluated on a number of candidate lists from the literature and demonstrated to significantly enrich ordered candidate variant lists for likely disease-associated variants. Finally, the Training for Recognition and Integration of Phenotypes in Ocular Disease (TRIPOD) web utility is introduced as a means of simultaneously training and learning from clinicians about heritable ocular diseases. This tool currently contains a number of case studies documenting a wide range of diseases and challenges trainees to virtually diagnose patients based on presented image data. Annotations by trainees and experts alike are used to construct rich phenotypic profiles for patients with known disease genotypes. The strategies presented in this dissertation are specifically applicable to heritable retinal dystrophies and have resulted in a number of improvements to the accurate molecular diagnosis of patient diseases. They also provide a generalizable framework for disease-associated variant identification in any heritable, genetically heterogeneous disease, and represent the ongoing challenge of accurate diagnosis in the information age.
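
The abstract does not spell out PULP's formulation, but a common positive-unlabeled ranking heuristic it resembles is sketched below with invented gene features: treat unlabeled genes as provisional negatives, fit a classifier, and rank every gene by its positive-class score. This is a generic PU baseline, not the dissertation's method.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(0)
    genes = rng.normal(size=(500, 10))   # toy gene feature vectors
    labels = np.zeros(500, dtype=int)
    labels[:25] = 1                      # a few known disease genes; rest unlabeled

    # PU heuristic: treat unlabeled genes as provisional negatives, then rank
    # every gene by the classifier's positive-class score.
    clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(genes, labels)
    scores = clf.predict_proba(genes)[:, 1]
    ranking = np.argsort(-scores)        # most disease-gene-like first
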
929

Reliability of Technical Stock Price Pattern Predictability

Lutey, Matthew 05 August 2019 (has links)
Academic research has shown over the years that technical indicators convey predictive value and informational content and have practical use. The popularity of such studies has waxed and waned, and the approach is today widely recognized by behavioral economists. Automated technical analysis is said to detect geometric and nonlinear shapes in prices that ordinary time series methods would be unable to detect. Previous papers use smoothing estimators to detect such patterns; our paper uses local polynomial regressions, digital image processing, and state-of-the-art machine learning tools. Our results show that the detected patterns are nonrandom, convey informational value, and have some predictive ability. We validate our results against prior work using daily price observations of stocks from the Dow Jones Industrial Average over the 1925-2019 sample period.
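
A minimal sketch of the local polynomial smoothing step, in the spirit of the smoothing-estimator literature this abstract builds on: fit a kernel-weighted low-degree polynomial around each day and take its value at the center, producing a smooth curve whose local extrema can define candidate chart patterns. The bandwidth, degree, and random-walk price series are illustrative choices, not the paper's.

    import numpy as np

    def local_poly_smooth(prices, bandwidth, degree=2):
        # Fit a Gaussian-kernel-weighted polynomial around each day and take
        # its value at the window center, yielding a smooth price curve.
        n = len(prices)
        t = np.arange(n, dtype=float)
        smoothed = np.empty(n)
        for i in range(n):
            k = np.exp(-0.5 * ((t - i) / bandwidth) ** 2)   # kernel weights
            coeffs = np.polyfit(t - i, prices, degree, w=np.sqrt(k))
            smoothed[i] = coeffs[-1]       # polynomial evaluated at the center
        return smoothed

    prices = 100 + np.cumsum(np.random.default_rng(0).normal(size=250))
    trend = local_poly_smooth(prices, bandwidth=5.0)
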
930

Big Data Analytics and Engineering for Medicare Fraud Detection

Unknown Date (has links)
The United States (U.S.) healthcare system produces an enormous volume of data, with a vast number of financial transactions generated by physicians administering healthcare services. This makes healthcare fraud difficult to detect, especially when there are considerably fewer fraudulent transactions than non-fraudulent ones. Fraud is an extremely important issue for healthcare, as fraudulent activities within the U.S. healthcare system contribute to significant financial losses. In the U.S., the elderly population continues to rise, increasing the need for programs such as Medicare to help with associated medical expenses. Unfortunately, due to healthcare fraud, these programs are being adversely affected, draining resources and reducing the quality and accessibility of necessary healthcare services. In response, advanced data analytics have recently been explored to detect possible fraudulent activities. The Centers for Medicare and Medicaid Services (CMS) released several ‘Big Data’ Medicare claims datasets for different parts of their Medicare program to help facilitate this effort. In this dissertation, we employ three CMS Medicare Big Data datasets to evaluate the fraud detection performance available using advanced data analytics techniques, specifically machine learning. We use two distinct approaches, designated anomaly detection and traditional fraud detection, each with very distinct data processing and feature engineering. Anomaly detection experiments classify by provider specialty, determining whether outlier physicians within the same specialty signal fraudulent behavior. Traditional fraud detection refers to experiments directly classifying physicians as fraudulent or non-fraudulent, leveraging machine learning algorithms to discriminate between classes. We present our novel data engineering approaches for both anomaly detection and traditional fraud detection, including data processing, fraud mapping, and the creation of a combined dataset consisting of all three Medicare parts. We incorporate the List of Excluded Individuals and Entities database to identify real-world fraudulent physicians for model evaluation. Regarding features, the final datasets for anomaly detection contain only claim counts for every procedure a physician submits, while traditional fraud detection incorporates aggregated counts and payment information, specialty, and gender. Additionally, we compare cross-validation to the real-world application of building a model on a training dataset and evaluating on a separate test dataset under severe class imbalance and rarity. / Dissertation (Ph.D.)--Florida Atlantic University, 2019.
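
The abstract's closing point, building a model on a training set and scoring a held-out test set under severe class imbalance, can be illustrated with a short sketch. The data below are synthetic (roughly one positive per thousand, echoing the rarity of confirmed fraud labels), and the model, split, and AUC metric are illustrative choices rather than the dissertation's exact pipeline.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split

    # Severe imbalance: roughly one positive per thousand providers.
    X, y = make_classification(n_samples=50000, n_features=15,
                               weights=[0.999, 0.001], random_state=0)

    # Fit on one split, score a held-out split, and use a ranking metric (AUC)
    # that stays meaningful when the positive class is rare.
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y,
                                              random_state=0)
    clf = RandomForestClassifier(n_estimators=100, class_weight="balanced",
                                 random_state=0).fit(X_tr, y_tr)
    print("held-out AUC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
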
