  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

BCAP: An Artificial Neural Network Pruning Technique to Reduce Overfitting

Brantley, Kiante 23 July 2016 (has links)
Determining the optimal size of a neural network is complicated. Neural networks with many free parameters can be used to solve very complex problems, but they are susceptible to overfitting. BCAP (Brantley-Clark Artificial Neural Network Pruning Technique) addresses overfitting by combining duplicate neurons in a neural network's hidden layer, thereby forcing the network to learn more distinct features. We compare hidden units using cosine similarity and combine those that are similar to each other within a threshold ε. By doing so, the co-adaptation of the neurons in the network is reduced, because hidden units that are highly correlated (i.e., similar) are combined. In this paper we show evidence that BCAP is successful in reducing network size while maintaining or improving the accuracy of neural networks during and after training.
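
As a rough illustration of the pruning idea described above (not BCAP's actual implementation), the sketch below merges hidden units whose incoming weight vectors exceed a cosine-similarity threshold ε; the layer shapes and the rule for combining merged units are assumptions.

```python
import numpy as np

def prune_similar_units(W_in, W_out, eps=0.95):
    """Merge hidden units whose incoming weight vectors have cosine similarity >= eps.
    W_in: (n_inputs, n_hidden) incoming weights; W_out: (n_hidden, n_outputs) outgoing.
    Merging rule (an assumption): average the incoming weights and sum the outgoing
    weights so the merged unit preserves its downstream contribution."""
    keep = list(range(W_in.shape[1]))
    while len(keep) > 1:
        cols = W_in[:, keep]
        cols = cols / (np.linalg.norm(cols, axis=0, keepdims=True) + 1e-12)
        sim = cols.T @ cols                      # pairwise cosine similarities
        np.fill_diagonal(sim, -1.0)
        i, j = np.unravel_index(np.argmax(sim), sim.shape)
        if sim[i, j] < eps:
            break
        a, b = keep[i], keep[j]
        W_in[:, a] = 0.5 * (W_in[:, a] + W_in[:, b])   # combine incoming weights
        W_out[a, :] = W_out[a, :] + W_out[b, :]        # keep downstream contribution
        keep.pop(j)
    return W_in[:, keep], W_out[keep, :]
```
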
2

Integrating Exponential Dispersion Models to Latent Structures

Basbug, Mehmet Emin 08 February 2017 (has links)
Latent variable models have two basic components: a latent structure encoding a hypothesized complex pattern and an observation model capturing the data distribution. With the advancements in machine learning and the increasing availability of resources, we are able to perform inference in deeper and more sophisticated latent variable models. In most cases, these models are designed with a particular application in mind; hence, they tend to have restrictive observation models. A challenge that has surfaced with the increasing diversity of data sets is to generalize these latent models to work with different data types. We aim to address this problem by utilizing exponential dispersion models (EDMs) and proposing mechanisms for incorporating them into latent structures. (Abstract shortened by ProQuest.)
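
For readers unfamiliar with EDMs, the standard reproductive form (a textbook definition, not quoted from the dissertation) is:

```latex
% Reproductive exponential dispersion model in natural-parameter form:
% \theta is the natural parameter, \lambda the index (inverse dispersion) parameter,
% and A(\theta) the log-partition (cumulant) function.
f(y \mid \theta, \lambda)
  = h(y, \lambda)\, \exp\!\bigl( \lambda \,[\, \theta y - A(\theta) \,] \bigr),
\qquad
\mathbb{E}[Y] = A'(\theta),
\qquad
\operatorname{Var}(Y) = \frac{A''(\theta)}{\lambda}.
```
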
3

Discovering credible events in near real time from social media streams

Buntain, Cody 26 January 2017 (has links)
Recent reliance on social media platforms as major sources of news and information, both for journalists and the larger population and especially during times of crisis, motivates the need for better methods of identifying and tracking high-impact events in these social media streams. Social media's volume, velocity, and democratization of information (leading to limited quality controls) complicate rapid discovery of these events and one's ability to trust the content posted about them. This dissertation addresses these complications in four stages, using Twitter as a model social platform.

The first stage analyzes Twitter's response to major crises, specifically terrorist attacks in Western countries, showing that these high-impact events do not significantly affect message or user volume. Instead, these events drive changes in Twitter's topic distribution, with conversation, retweets, and hashtags relevant to these events experiencing significant, rapid, and short-lived bursts in frequency. Furthermore, conversation participants tend to prefer information from local authorities, organizations, and media over national or international sources, with accounts for local police or local newspapers often emerging as central in the networks of interaction.

Building on these results, the second stage presents and evaluates a set of features that capture the topical bursts associated with crises by modeling bursts in frequency for individual tokens in the Twitter stream. The resulting streaming algorithm is capable of discovering notable moments across a series of major sports competitions using Twitter's public stream without relying on domain- or language-specific information or models. Furthermore, results demonstrate that models trained on sporting-competition data perform well when transferred to earthquake identification.

This streaming algorithm is extended in the third stage to support real-time event tracking and summarization. The real-time algorithm leverages new distributed processing technology to operate at scale and is evaluated against a collection of other community-developed information retrieval systems, where it performs comparably. Further experiments also show that this real-time burst detection algorithm can be integrated with these other information retrieval systems to increase overall performance.

The final stage investigates automated methods for evaluating credibility in social media streams by leveraging two existing data sets. These two data sets measure different types of credibility (veracity versus perception). Results show that veracity is negatively correlated with the amount of disagreement in and the length of a conversation, while perceptions of credibility are influenced by the number of links to other pages, the amount of shared media about the event, and the number of verified users participating in the discussion.

The contributions made across these four stages are usable in the relatively new fields of computational journalism and crisis informatics, which seek to improve news gathering and crisis response by leveraging new technologies and data sources such as machine learning and social media.
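
As a rough illustration of the token-level burst modeling described above (the dissertation's exact features and thresholds are not given here), the following sketch flags tokens whose per-window frequency spikes well above their recent mean; the history length and z-score threshold are assumptions.

```python
from collections import Counter, deque
import math

def burst_detector(token_windows, history=30, z_thresh=3.0):
    """Yield (window_index, token) pairs whose count in the current window rises
    more than z_thresh standard deviations above that token's recent mean.
    token_windows: iterable of token lists, one list per time window."""
    past = {}  # token -> deque of recent per-window counts
    for w, tokens in enumerate(token_windows):
        counts = Counter(tokens)
        for tok in set(counts) | set(past):
            c = counts.get(tok, 0)
            hist = past.setdefault(tok, deque(maxlen=history))
            if len(hist) >= 5:                       # require some history first
                mean = sum(hist) / len(hist)
                var = sum((x - mean) ** 2 for x in hist) / len(hist)
                std = math.sqrt(var) + 1e-9
                if (c - mean) / std > z_thresh:
                    yield w, tok
            hist.append(c)
```
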
4

Data-driven computer vision for science and the humanities

Lee, Stefan 05 November 2016 (has links)
The rate at which humanity is producing visual data from both large-scale scientific imaging and consumer photography has been greatly accelerating in the past decade. This thesis is motivated by the hypothesis that this trend will necessarily change the face of observational science and the humanities, requiring the development of automated methods capable of distilling vast image collections to produce meaningful analyses. Such methods are needed to empower novel science both by improving throughput in traditionally quantitative disciplines and by developing new techniques to study culture through large-scale image datasets.

When computer vision, or machine learning in general, is leveraged to aid academic inquiry, it is important to consider the impact of erroneous solutions produced by implicit ambiguity or model approximations. To that end, we argue for the importance of algorithms that are capable of generating multiple solutions and producing measures of confidence. In addition to providing solutions to a number of multi-disciplinary problems, this thesis develops techniques to address these overarching themes of confidence estimation and solution diversity.

This thesis investigates a diverse set of problems across a broad range of studies, including glaciology, developmental psychology, architectural history, and demography, to develop and adapt computer vision algorithms to solve these domain-specific applications. We begin by proposing vision techniques for automatically analyzing aerial radar imagery of polar ice sheets while simultaneously providing glaciologists with point-wise estimates of solution confidence. We then move to psychology, introducing novel recognition techniques to produce robust hand localizations and segmentations in egocentric video, empowering psychologists studying child development with automated annotations of the grasping behaviors integral to learning. We then investigate novel large-scale analysis for architectural history, leveraging tens of thousands of publicly available images to identify and track distinctive architectural elements. Finally, we show how rich estimates of demographic and geographic properties can be predicted from a single photograph.
5

Semi-Supervised Learning for Electronic Phenotyping in Support of Precision Medicine

Halpern, Yonatan 15 December 2016 (has links)
Medical informatics plays an important role in precision medicine, delivering the right information to the right person at the right time. With the introduction and widespread adoption of electronic medical records, in the United States and worldwide, there is now a tremendous amount of health data available for analysis.

Electronic record phenotyping refers to the task of determining, from an electronic medical record entry, a concise descriptor of the patient, comprising their medical history, current problems, presentation, etc. In inferring such a phenotype descriptor from the record, a computer, in a sense, "understands" the relevant parts of the record. These phenotypes can then be used in downstream applications such as cohort selection for retrospective studies, real-time clinical decision support, contextual displays, intelligent search, and precise alerting mechanisms.

We are faced with three main challenges:

First, the unstructured and incomplete nature of the data recorded in electronic medical records requires special attention. Relevant information can be missing or written in an obscure way that the computer does not understand.

Second, the scale of the data makes it important to develop efficient methods at all steps of the machine learning pipeline, including data collection and labeling, model learning, and inference.

Third, large parts of medicine are well understood by health professionals. How do we combine the expert knowledge of specialists with the statistical insights from the electronic medical record?

Probabilistic graphical models such as Bayesian networks provide a useful abstraction for quantifying uncertainty and describing complex dependencies in data. Although significant progress has been made over the last decade on approximate inference algorithms and structure learning from complete data, learning models with incomplete data remains one of machine learning's most challenging problems. How can we model the effects of latent variables that are not directly observed?

The first part of the thesis presents two different structural conditions under which learning with latent variables is computationally tractable. The first is the "anchored" condition, where every latent variable has at least one child that is not shared by any other parent. The second is the "singly-coupled" condition, where every latent variable is connected to at least three children that satisfy conditional independence (possibly after transforming the data).

Variables that satisfy these conditions can be specified by an expert without requiring that the entire structure or its parameters be specified, allowing for effective use of human expertise and making room for statistical learning to do some of the heavy lifting. For both the anchored and singly-coupled conditions, practical algorithms are presented.

The second part of the thesis describes real-life applications using the anchored condition for electronic phenotyping. A human-in-the-loop learning system and a functioning emergency informatics system for real-time extraction of important clinical variables are described and evaluated.

The algorithms and discussion presented here were developed for the purpose of improving healthcare, but they are much more widely applicable, dealing with the very basic questions of identifiability and learning models with latent variables, a problem that lies at the very heart of the natural and social sciences.
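
As a rough illustration of the anchored condition in practice (a generic anchor-as-noisy-label sketch, not the dissertation's exact algorithm), the following treats a single unambiguous "anchor" observation as a surrogate label for the latent phenotype and calibrates the resulting classifier; the calibration step and feature handling are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def anchor_phenotype_model(X, anchor_col):
    """Train a phenotype scorer using one 'anchor' feature as a noisy label.
    X: binary feature matrix (n_patients, n_features); anchor_col: index of the
    anchor feature (e.g. an unambiguous code for the phenotype).
    Returns a function scoring the phenotype from the remaining features."""
    y_noisy = X[:, anchor_col]                        # anchor acts as surrogate label
    X_rest = np.delete(X, anchor_col, axis=1)         # never let the model see the anchor
    clf = LogisticRegression(max_iter=1000).fit(X_rest, y_noisy)
    # Positive-unlabeled-style calibration (an assumption): rescale by the mean score
    # among anchor-positive patients so predictions target the latent phenotype.
    c = clf.predict_proba(X_rest[y_noisy == 1])[:, 1].mean()
    return lambda X_new: np.clip(
        clf.predict_proba(np.delete(X_new, anchor_col, axis=1))[:, 1] / c, 0, 1)
```
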
6

An evolutionary method for training autoencoders for deep learning networks

Lander, Sean 18 November 2016 (has links)
Introduced in 2006, Deep Learning has made large strides in both supervised and unsupervised learning. The abilities of Deep Learning have been shown to beat both generic and highly specialized classification and clustering techniques with little change to the underlying concept of a multi-layer perceptron. Though this has caused a resurgence of interest in neural networks, many of the drawbacks and pitfalls of such systems have yet to be addressed after nearly 30 years: speed of training, local minima, and manual testing of hyper-parameters.

In this thesis we propose using an evolutionary technique to work toward solving these issues and to increase the overall quality and abilities of Deep Learning networks. In evolving a population of autoencoders for input reconstruction, we abstract multiple features for each autoencoder in the form of hidden nodes, score the autoencoders based on their ability to reconstruct their input, and finally select autoencoders for crossover and mutation with hidden nodes as the chromosome. In this way we are able not only to quickly find optimal abstracted feature sets but also to optimize the structure of the autoencoder to match the features being selected. This also allows us to experiment with different training methods with respect to data partitioning and selection, reducing overall training time drastically for large and complex datasets. The proposed method allows even large datasets to be trained quickly and efficiently with little manual parameter choice required of the user, leading to faster, more accurate creation of Deep Learning networks.
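
As a rough illustration of the evolutionary loop described above (a toy sketch, not the thesis's system), the following evolves a population of tied-weight autoencoders whose hidden nodes serve as the chromosome, scoring each by reconstruction error; the tanh/linear autoencoder, crossover rule, and mutation noise are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(hidden, X):
    """Negative reconstruction error of a tied-weight autoencoder whose hidden
    units are the rows of `hidden` (a simplification: no per-individual training)."""
    H = np.tanh(X @ hidden.T)                 # encode
    X_hat = H @ hidden                        # decode with tied weights
    return -np.mean((X - X_hat) ** 2)         # higher is better

def evolve(X, pop_size=20, gens=50):
    n_inputs = X.shape[1]
    pop = [rng.normal(0, 0.1, (rng.integers(2, 16), n_inputs)) for _ in range(pop_size)]
    for _ in range(gens):
        scored = sorted(pop, key=lambda h: fitness(h, X), reverse=True)
        parents = scored[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.choice(len(parents), 2, replace=False)
            # crossover: child inherits a random subset of each parent's hidden nodes
            rows = np.vstack([parents[a], parents[b]])
            take = rng.random(len(rows)) < 0.5
            child = rows[take] if take.any() else rows[:1]
            child = child + rng.normal(0, 0.01, child.shape)   # mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda h: fitness(h, X))
```
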
7

Automated Feature Engineering for Deep Neural Networks with Genetic Programming

Heaton, Jeff 19 April 2017 (has links)
Feature engineering is a process that augments the feature vector of a machine learning model with calculated values designed to enhance the accuracy of the model's predictions. Research has shown that the accuracy of models such as deep neural networks, support vector machines, and tree/forest-based algorithms sometimes benefits from feature engineering. These engineered features are usually created from expressions that combine one or more of the original features, and the exact structure of an engineered feature depends on the type of machine learning model in use. Previous research demonstrated that different model families benefit from different types of engineered features: random forests, gradient-boosting machines, and other tree-based models might not see the same accuracy gain from an engineered feature that neural networks, generalized linear models, and other dot-product-based models achieve on the same data set.

This dissertation presents a genetic programming-based algorithm that automatically engineers features that increase the accuracy of deep neural networks for some data sets. For a genetic programming algorithm to be effective, it must prioritize the search space and efficiently evaluate what it finds. The algorithm presented here faced a potential search space composed of all possible mathematical combinations of the original feature vector. Five experiments were designed to guide the search process to efficiently evolve good engineered features. The result of this dissertation is an automated feature engineering (AFE) algorithm that is computationally efficient, even though a neural network is used to evaluate each candidate feature. This approach gave the algorithm a greater opportunity to specifically target deep neural networks in its search for engineered features that improve accuracy. Finally, a sixth experiment empirically demonstrated the degree to which this algorithm improved the accuracy of neural networks on data sets augmented by the algorithm's engineered features.
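
As a rough illustration of genetic programming over the original feature vector (not the dissertation's AFE algorithm), the following evolves expression trees built from the original features; for brevity, fitness here is correlation with the target rather than the neural-network evaluation the dissertation uses, and variation is limited to subtree mutation.

```python
import numpy as np

OPS = {"add": np.add, "sub": np.subtract, "mul": np.multiply,
       "div": lambda a, b: a / (np.abs(b) + 1e-6)}   # protected division
rng = np.random.default_rng(0)

def random_expr(n_features, depth=2):
    """Random expression tree over the original features."""
    if depth == 0 or rng.random() < 0.3:
        return ("feat", int(rng.integers(n_features)))
    op = str(rng.choice(list(OPS)))
    return (op, random_expr(n_features, depth - 1), random_expr(n_features, depth - 1))

def evaluate(expr, X):
    if expr[0] == "feat":
        return X[:, expr[1]]
    return OPS[expr[0]](evaluate(expr[1], X), evaluate(expr[2], X))

def fitness(expr, X, y):
    """Stand-in fitness: absolute correlation between the engineered feature
    and the target (the dissertation scores candidates with a neural network)."""
    f = evaluate(expr, X)
    return abs(np.corrcoef(f, y)[0, 1]) if np.std(f) > 0 else 0.0

def mutate(expr, n_features, p=0.3):
    """Subtree mutation: with probability p, replace this subtree with a fresh one."""
    if rng.random() < p:
        return random_expr(n_features)
    if expr[0] == "feat":
        return expr
    return (expr[0], mutate(expr[1], n_features, p), mutate(expr[2], n_features, p))

def evolve_feature(X, y, pop_size=50, gens=30):
    n = X.shape[1]
    pop = [random_expr(n) for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=lambda e: fitness(e, X, y), reverse=True)
        survivors = pop[: pop_size // 2]
        pop = survivors + [mutate(s, n) for s in survivors]   # elitism + mutation
    return pop[0]
```
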
8

Skald: Exploring Story Generation and Interactive Storytelling by Reconstructing Minstrel

Tearse, Brandon 16 February 2019 (has links)
Within the realm of computational story generation sits Minstrel, a decades-old system once used to explore the idea that, under the correct conditions, novel stories can be generated by taking an existing story and replacing some of its elements with similar ones found in a different story. This concept would eventually fall within the bounds of a strategy known as case-based reasoning (CBR), in which problems are solved by recalling solutions to past problems (the cases) and mutating the recalled cases to create an appropriate solution to the current problem. This dissertation uses a rational reconstruction of Minstrel called Minstrel Remixed, a handful of upgraded variants of Minstrel Remixed, and a pair of similar but unrelated storytelling systems to explore various characteristics of Minstrel-style storytelling systems.

In the first part of this dissertation I define the class of storytelling systems that are similar to Minstrel. This definition allows me to compare the features of these systems and discuss the various strengths and weaknesses of the variants. Furthermore, I briefly describe the rational reconstruction of Minstrel and then provide a detailed overview of the inner workings of the resulting system, Minstrel Remixed.

Once Minstrel Remixed was complete, I chose to upgrade it in order to explore the set of stories that it could produce and ways to alter or reconfigure the system with the goal of intentionally influencing the set of possible outputs. This investigation resulted in two new storytelling systems called Conspiracy Forever and Problem Planets. The second portion of this dissertation discusses these systems as well as a number of discoveries about the strengths and weaknesses of Minstrel-style storytelling systems in general. More specifically, I discuss that (1) a human reader's capacity for creating patterns out of an assortment of statements is incredibly useful, and output should be crafted to use this potential; (2) Minstrel-style storytelling tends to be amnesiac and to do a poor job of creating long stories that remain cohesive; and (3) the domain that a storytelling system works from is incredibly important and must be well engineered. I continue by discussing the methods I discovered for cleaning up and maintaining a domain, and conclude with a section covering interviews with other storytelling-system creators about the strengths and weaknesses of their systems in light of my findings about Minstrel Remixed.

In the final portion of this document I create a framework of six interrelated attributes of stories (length, coherence, creativity, complexity, contextuality, and consolidation) and use it, along with the learning discussed in the first two portions of the dissertation, to discuss the strengths and weaknesses of this class of CBR systems when applied to both static story generation and interactive storytelling. I discuss the finding that these systems seem to have a fixed amount of power: although they can be tweaked to produce, for example, longer or more consolidated stories, these improvements always come with a reduction in complexity, coherence, or one of the other attributes. Further discussion of the output power of this class of storytelling systems revolves around the primary factor limiting their potential, namely that they have no understanding of the symbols and patterns they are manipulating. Finally, I introduce a number of strategies that I found fruitful for increasing the 'output power' of the system and getting around the lack of commonsense reasoning, chiefly improving the domain and adding new subsystems.
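
As a rough illustration of the recall-and-adapt loop at the heart of Minstrel-style CBR (loosely inspired by Minstrel's TRAMs, not a reconstruction of Minstrel Remixed), the following retrieves the most similar stored case and adapts it, relaxing the query via transforms when nothing similar enough is found; all helper functions and the threshold are assumptions.

```python
def tram_recall(query, cases, similarity, transforms, adapt, threshold=0.8, depth=2):
    """Sketch of a Minstrel-style recall-and-adapt step: retrieve the most similar
    stored case if it is similar enough, otherwise relax the query with a transform
    and recurse, adapting whatever is found back to the original query.
    `similarity`, `transforms`, and `adapt` are assumed, domain-specific helpers."""
    best = max(cases, key=lambda c: similarity(query, c), default=None)
    if best is not None and similarity(query, best) >= threshold:
        return adapt(best, query)
    if depth == 0:
        return None
    for t in transforms:
        found = tram_recall(t(query), cases, similarity, transforms, adapt,
                            threshold, depth - 1)
        if found is not None:
            return adapt(found, query)       # re-adapt the result to the original query
    return None
```
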
9

Spiking Neural Networks and Sparse Deep Learning

Tavanaei, Amirhossein 23 March 2019 (has links)
This document proposes new methods for training multi-layer and deep spiking neural networks (SNNs), specifically spiking convolutional neural networks (CNNs). Training a multi-layer spiking network poses difficulties because the output spikes do not have derivatives, so the backpropagation method commonly used for non-spiking networks is not easily applied. Our methods use novel versions of the brain-like, local learning rule named spike-timing-dependent plasticity (STDP) that incorporate supervised and unsupervised components. Our method starts with conventional learning methods and converts them to spatio-temporally local rules suited to SNNs.

The training uses two components, for unsupervised feature extraction and supervised classification. The first component consists of new STDP rules for spike-based representation learning that train convolutional filters and initial representations. The second introduces new STDP-based supervised learning rules for spike-pattern classification via an approximation to gradient descent obtained by combining STDP and anti-STDP rules. Specifically, the STDP-based supervised learning model approximates gradient descent by using temporally local STDP rules. Stacking these components implements a novel sparse, spiking deep learning model. Our spiking deep learning model is a variation of spiking CNNs of integrate-and-fire (IF) neurons, with performance comparable to state-of-the-art deep SNNs. The experimental results show the success of the proposed model for image classification. Our network architecture is the only spiking CNN that provides bio-inspired STDP rules in a hierarchy of feature extraction and classification in an entirely spike-based framework.
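
As a reference point for the learning rule named above, here is the textbook pair-based STDP weight update (not the dissertation's probabilistic or anti-STDP variants); the time constants and learning rates are conventional placeholder values.

```python
import numpy as np

def stdp_update(w, t_pre, t_post, a_plus=0.01, a_minus=0.012,
                tau_plus=20.0, tau_minus=20.0, w_min=0.0, w_max=1.0):
    """Pair-based STDP weight update. t_pre, t_post: spike times (ms) of the pre-
    and post-synaptic neurons; potentiate when the pre spike precedes the post spike,
    depress otherwise, with exponentially decaying windows."""
    dt = t_post - t_pre
    if dt >= 0:
        dw = a_plus * np.exp(-dt / tau_plus)      # pre before post: potentiation
    else:
        dw = -a_minus * np.exp(dt / tau_minus)    # post before pre: depression
    return float(np.clip(w + dw, w_min, w_max))
```
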
10

Communicating Plans in Ad Hoc Multiagent Teams

Santarra, Trevor 16 April 2019 (has links)
With the rising use of autonomous agents in robotic and software settings, agents may be required to cooperate in teams while having little or no information regarding the capabilities of their teammates. In these ad hoc settings, teams must collaborate on the fly, with no prior opportunity for coordination. Prior research in this area commonly either assumes that communication between agents is impossible given their heterogeneous design or leaves communication as an open problem. Typically, to accurately predict a teammate's behavior at a future point in time, ad hoc agents leverage models learned from past experience and attempt to infer a teammate's intended strategy by observing its current course of action. However, these approaches can fail to arrive at accurate policy predictions, leaving the coordinating agent uncertain and unable to adapt to its teammates' plans. We introduce the problem of communicating minimal sets of teammate policies in order to provide information for collaboration in such ad hoc environments. We demonstrate how an agent may determine what information it should solicit from its peers, but further illustrate that optimal solutions to this problem have intractable computational requirements. Nonetheless, through the characterization of this difficulty, we identify strategies that permit approximate or heuristic approaches, allowing the practical application of this capability in ad hoc teams.
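
As a rough illustration of the approximate strategies the abstract alludes to (the dissertation's formulation is more involved, and optimal selection is shown to be intractable), the following greedy sketch picks, under a communication budget, the teammate-policy queries that most reduce an assumed expected-coordination-loss estimate; all helpers are hypothetical.

```python
def greedy_policy_queries(candidate_states, expected_loss, budget):
    """Greedy heuristic for choosing which teammate-policy questions to ask under a
    communication budget: repeatedly pick the query (here, a state whose teammate
    action we could ask about) that most reduces expected coordination loss.
    `expected_loss(known)` is an assumed, domain-specific estimator."""
    known, remaining = set(), set(candidate_states)
    for _ in range(budget):
        best = max(remaining,
                   key=lambda s: expected_loss(known) - expected_loss(known | {s}),
                   default=None)
        if best is None:
            break
        known.add(best)
        remaining.discard(best)
    return known
```
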
