1

Various considerations on performance measures for a classification of ordinal data

Nyongesa, Denis Barasa 13 August 2016 (has links)
Technological advancement and the escalating interest in personalized medicine have resulted in an increasing number of ordinal classification problems. The most commonly used performance metrics for evaluating the effectiveness of a multi-class ordinal classifier are predictive accuracy, Kendall's tau-b rank correlation, and the average mean absolute error (AMAE). These metrics are useful for classifying multi-class ordinal data, but no single one of them incorporates the misclassification cost. Recently, distance, which finds the optimal trade-off between predictive accuracy and misclassification cost, was proposed as a cost-sensitive performance metric for ordinal data. This thesis proposes criteria for variable selection and methods that account for minimum distance and improved accuracy, thereby providing a platform for a more comprehensive and comparative analysis of multiple ordinal classifiers. The strengths of our methodology are demonstrated through the analysis of a real colon cancer data set.
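The thesis does not publish code; as a minimal sketch of the metrics named above, the following assumes an absolute-difference misclassification cost and a hypothetical trade-off weight `alpha`, standing in only loosely for the proposed distance metric:

```python
# Minimal sketch of ordinal-classification metrics. The |i - j| cost and the
# accuracy/cost trade-off weight `alpha` are hypothetical placeholders; they
# approximate the spirit, not the exact form, of the proposed distance metric.
import numpy as np
from scipy.stats import kendalltau  # computes tau-b by default

def ordinal_metrics(y_true, y_pred, n_classes, alpha=0.5):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    accuracy = np.mean(y_true == y_pred)
    tau_b, _ = kendalltau(y_true, y_pred)   # rank agreement between labels
    # AMAE: mean absolute error averaged over true classes, so minority
    # classes are not swamped by majority ones.
    amae = np.mean([np.mean(np.abs(y_pred[y_true == k] - k))
                    for k in range(n_classes) if np.any(y_true == k)])
    cost = np.abs(y_true - y_pred).mean()   # absolute-difference cost
    distance = alpha * (1 - accuracy) + (1 - alpha) * cost
    return {"accuracy": accuracy, "tau_b": tau_b,
            "amae": amae, "distance": distance}

print(ordinal_metrics([0, 1, 2, 2, 1], [0, 2, 2, 1, 1], n_classes=3))
```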
2

Predicting National Basketball Association Game Outcomes Using Ensemble Learning Techniques

Valenzuela, Russell 25 April 2019 (has links)
There have been a number of studies that try to predict sporting event outcomes. Most previous research has involved results in football and college basketball; recent years have seen similar approaches carried out in professional basketball. This thesis attempts to build upon existing statistical techniques and apply them to the National Basketball Association, using a synthesis of algorithms as motivation. A number of ensemble learning methods will be utilized and compared in hopes of improving the accuracy of single models. Individual models used in this thesis will be derived from Logistic Regression, Naïve Bayes, Random Forests, Support Vector Machines, and Artificial Neural Networks, while aggregation techniques include Bagging, Boosting, and Stacking. Data from previous seasons and games, from both players and teams, will be used to train models in R.
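The models themselves are trained in R; purely as an illustration of the stacking idea with the five base learners listed above, a Python/scikit-learn sketch might look as follows (the data here are synthetic placeholders for team and player features):

```python
# Stacking sketch with the five base learners named in the abstract.
# X and y stand in for team/player features and win/loss outcomes.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
base_learners = [
    ("lr", LogisticRegression(max_iter=1000)),
    ("nb", GaussianNB()),
    ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
    ("svm", SVC(probability=True, random_state=0)),
    ("ann", MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000,
                          random_state=0)),
]
# Stacking: a logistic-regression meta-learner combines the base predictions.
stack = StackingClassifier(estimators=base_learners,
                           final_estimator=LogisticRegression(max_iter=1000))
print(cross_val_score(stack, X, y, cv=5).mean())
```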
3

Generative Probabilistic Models for Analysis of Communication Event Data with Applications to Email Behavior

Navaroli, Nicholas Martin 28 January 2015 (has links)
Our daily lives increasingly involve interactions with others via different communication channels, such as email, text messaging, and social media. In this context, the ability to analyze and understand our communication patterns is becoming increasingly important. This dissertation develops generative probabilistic models for describing different characteristics of communication behavior, focusing primarily on email communication.

First, we present a two-parameter kernel density estimator for estimating the probability density over the recipients of an email (or, more generally, over items which appear in an itemset). A stochastic gradient method is proposed for efficiently inferring the kernel parameters given a continuous stream of data. Next, we apply the kernel model and the Bernoulli mixture model to two important prediction tasks: given a partially completed email recipient list, 1) predict which others will be included in the email, and 2) rank potential recipients by their likelihood of being added to the email. Such predictions are useful in suggesting future actions to the user (e.g., which person to add to an email) based on their previous actions. We then investigate a piecewise-constant Poisson process model for describing the time-varying communication rate between an individual and several groups of their contacts, where changes in the Poisson rate are modeled as latent state changes within a hidden Markov model.

We next focus on the time it takes for an individual to respond to an event, such as receiving an email. We show that this response time depends heavily on the individual's typical daily and weekly patterns, which standard models of response time (e.g., the Gamma distribution or Hawkes processes) do not adequately capture. A time-warping mechanism is introduced in which absolute response time is modeled as a transformation of an effective response time, relative to the daily and weekly patterns of the individual. The usefulness of applying the time-warping mechanism to standard models of response time, both in terms of log-likelihood and accuracy in predicting which events will be responded to quickly, is illustrated over several individual email histories.
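As a toy illustration of the time-warping idea (our construction, not the dissertation's code), an effective response time can be obtained by integrating a hypothetical hourly activity profile between receipt and reply, so that overnight hours contribute almost nothing:

```python
# Toy time-warping sketch, assuming a known hourly activity profile.
import numpy as np

# Hypothetical weekly profile (168 hours): low overnight, high during the day.
# This stands in for the daily/weekly patterns learned from an email history.
profile = np.tile(np.concatenate([np.full(8, 0.05),    # 00:00-08:00 asleep
                                  np.full(14, 1.00),   # 08:00-22:00 active
                                  np.full(2, 0.20)]),  # 22:00-24:00 winding down
                  7)

def effective_response_time(t_received, t_replied):
    """Effective time = activity summed over whole hours between the events."""
    hours = np.arange(int(t_received), int(t_replied))
    return profile[hours % 168].sum()

# An email received Friday 21:00 (hour 117) and answered Saturday 09:00
# (hour 129) took 12 absolute hours but far less effective time:
print(effective_response_time(117, 129))
```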
4

Higher-order Random Walk Methods for Data Analysis

Wu, Tao 14 June 2018 (has links)
Markov random walk models are powerful analytical tools for many tasks in machine learning, numerical optimization, and data mining. The key assumption of a first-order Markov chain is memorylessness, which restricts the dependence of the transition distribution to the current state only. In many applications, however, this assumption is not appropriate. We propose a set of higher-order random walk techniques and discuss their applications to tensor co-clustering, user-trail modeling, and solving linear systems. First, we develop a new random walk model that we call the super-spacey random surfer, which simultaneously clusters the rows, columns, and slices of a nonnegative three-mode tensor; the algorithm generalizes to tensors with any number of modes. We partition the tensor by minimizing the exit probability between clusters when the super-spacey random walk is at stationarity. The second application is user-trail modeling, where user trails record the sequences of activities of individuals interacting with the Internet and the world. We propose the retrospective higher-order Markov process, a two-step process that first chooses a state from the history and then transitions as a first-order chain conditional on that state. This restricts the total number of parameters and thus protects the model from overfitting. Lastly, we propose using a time-inhomogeneous Markov chain to approximate the solution of a linear system, with multiple simulations of the random walk averaged to form the estimate. By allowing the random walk to transition based on multiple matrices, we decrease the variance of the simulations and thus increase the speed of the solver.
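As a schematic of the last application, the classical single-matrix von Neumann-Ulam estimator below uses random walks to solve x = Hx + b; the thesis's contribution, transitioning on multiple matrices to cut variance, is not reproduced here:

```python
# Minimal von Neumann-Ulam sketch: estimate one component of the solution of
# x = H x + b by simulating weighted random walks. Classical single-matrix
# baseline only; matrix, vector, and walk counts are toy values.
import numpy as np

rng = np.random.default_rng(1)
H = np.array([[0.1, 0.3], [0.2, 0.1]])  # spectral radius < 1, series converges
b = np.array([1.0, 2.0])
P = np.full((2, 2), 0.5)                # transition probabilities of the walk

def estimate_component(i, n_walks=20_000, max_steps=50):
    total = 0.0
    for _ in range(n_walks):
        state, weight = i, 1.0
        total += weight * b[state]      # k = 0 term of the Neumann series
        for _ in range(max_steps):
            nxt = rng.choice(2, p=P[state])
            weight *= H[state, nxt] / P[state, nxt]  # importance weight
            state = nxt
            total += weight * b[state]  # contributes the H^k b term
    return total / n_walks

exact = np.linalg.solve(np.eye(2) - H, b)
print(estimate_component(0), exact[0])  # stochastic estimate vs exact value
```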
5

Pachinko allocation: DAG-structured mixture models of topic correlations

Li, Wei 01 January 2007 (has links)
Statistical topic models are increasingly popular tools for summarization and manifold discovery in discrete data. However, the majority of existing approaches capture no or limited correlations between topics. We propose the pachinko allocation model (PAM), which captures arbitrary, nested, and possibly sparse correlations between topics using a directed acyclic graph (DAG). We present various structures within this framework, different parameterizations of topic distributions, and an extension to capture dynamic patterns of topic correlations. We also introduce a non-parametric Bayesian prior to automatically learn the topic structure from data. The model is evaluated on document classification, likelihood of held-out data, the ability to support fine-grained topics, and topical keyword coherence. With a highly-scalable approximation, PAM has also been applied to discover topic hierarchies in very large datasets.
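To make the generative process concrete, here is a toy forward sampler for a four-level PAM (root, super-topics, sub-topics, words); all dimensions and hyperparameters are invented for illustration, and inference is not shown:

```python
# Toy forward sampler for a four-level pachinko allocation model:
# root -> super-topics -> sub-topics -> words. Sizes and Dirichlet
# hyperparameters are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
S, K, V, doc_len = 3, 8, 1000, 50  # super-topics, sub-topics, vocab, words/doc
phi = rng.dirichlet(np.full(V, 0.01), size=K)  # word distribution per sub-topic

def sample_document():
    theta_root = rng.dirichlet(np.ones(S))           # doc's super-topic mix
    theta_super = rng.dirichlet(np.ones(K), size=S)  # per-super sub-topic mix
    words = []
    for _ in range(doc_len):
        s = rng.choice(S, p=theta_root)        # pick a super-topic
        k = rng.choice(K, p=theta_super[s])    # pick a sub-topic given it
        words.append(rng.choice(V, p=phi[k]))  # emit a word from the sub-topic
    return words

print(sample_document()[:10])
```

Because the per-super-topic distributions are drawn per document, sub-topics that share a super-topic co-occur more often, which is how the DAG encodes topic correlations.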
6

Optimal learning: Computational procedures for Bayes-adaptive Markov decision processes

Duff, Michael O'Gordon 01 January 2002 (has links)
This dissertation considers a particular aspect of sequential decision making under uncertainty in which, at each stage, a decision-making agent operating in an uncertain world takes an action that elicits a reinforcement signal and causes the state of the world to change. Optimal learning is a pattern of behavior that yields the highest expected total reward over the entire duration of an agent's interaction with its uncertain world. The problem of determining an optimal learning strategy is a sort of meta-problem, with optimality defined with respect to a distribution of environments that the agent is likely to encounter. Given this prior uncertainty over possible environments, the optimal-learning agent must collect and use information in an intelligent way, balancing greedy exploitation of certainty-equivalent world models with exploratory actions aimed at discerning the true state of nature. My approach to approximating optimal learning strategies retains the full model of the sequential decision process that, in incorporating a Bayesian model for evolving uncertainty about unknown process parameters, takes the form of a Markov decision process defined over a set of “hyperstates” whose cardinality grows exponentially with the planning horizon. I develop computational procedures that retain the full Bayesian formulation, but sidestep intractability by utilizing techniques from reinforcement learning theory (specifically, Monte-Carlo simulation and the adoption of parameterized function approximators). By pursuing an approach that is grounded in a complete Bayesian world model, I develop algorithms that produce policies that exhibit performance gains over simple heuristics. Moreover, in contrast to many heuristics, the justification or legitimacy of the policies follows directly from the fact that they are clearly motivated by a complete characterization of the underlying decision problem to be solved. This dissertation's contributions include a reinforcement learning algorithm for estimating Gittins indices for multi-armed bandit problems, a Monte-Carlo gradient-based algorithm for approximating solutions to general problems of optimal learning, a gradient-based scheme for improving optimal learning policies instantiated as finite-state stochastic automata, and an investigation of diffusion processes as analytical models for evolving uncertainty.
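A small concrete instance of the hyperstate formulation (a toy example, not code from the dissertation): for a two-armed Bernoulli bandit with Beta priors and a finite horizon, the Bayes-optimal value can be computed exactly by recursing over posterior counts, which also makes the exponential growth of the hyperstate space tangible:

```python
# Toy Bayes-adaptive MDP: two-armed Bernoulli bandit with Beta(1,1) priors.
# The hyperstate is the posterior (a1, b1, a2, b2); exact value iteration over
# hyperstates is feasible only for small horizons, which is precisely the
# intractability the dissertation's approximations address.
from functools import lru_cache

@lru_cache(maxsize=None)
def V(a1, b1, a2, b2, horizon):
    if horizon == 0:
        return 0.0
    p1, p2 = a1 / (a1 + b1), a2 / (a2 + b2)  # posterior success probabilities
    # Expected return of pulling each arm, then continuing optimally:
    q1 = p1 * (1 + V(a1 + 1, b1, a2, b2, horizon - 1)) \
         + (1 - p1) * V(a1, b1 + 1, a2, b2, horizon - 1)
    q2 = p2 * (1 + V(a1, b1, a2 + 1, b2, horizon - 1)) \
         + (1 - p2) * V(a1, b1, a2, b2 + 1, horizon - 1)
    return max(q1, q2)

# Value of optimal learning over 10 pulls, starting from uniform priors:
print(V(1, 1, 1, 1, 10))
```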
7

Causal inference with instruments and other supplementary variables

Ramsahai, Roland Ryan January 2008 (has links)
Instrumental variables have been used for a long time in the econometrics literature for the identification of the causal effect of one random variable, B, on another, C, in the presence of unobserved confounders. In the classical continuous linear model, the causal effect can be point identified by studying the regression of C on A and B on A, where A is the instrument. An instrument is an instance of a supplementary variable which is not of interest in itself but aids identification of causal effects. The method of instrumental variables is extended here to generalised linear models, for which only bounds on the causal effect can be computed. For the discrete instrumental variable model, bounds have been derived in the literature for the causal effect of B on C in terms of the joint distribution of (A,B,C). Using an approach based on convex polytopes, bounds are computed here in terms of the pairwise (A,B) and (A,C) distributions, in direct analogy to the classic use but without the linearity assumption. The bounding technique is also adapted to instrumental models with stronger and weaker assumptions. The computation produces constraints which can be used to invalidate the model. In the literature, constraints of this type are usually tested by checking whether the relative frequencies satisfy them. This is unsatisfactory from a statistical point of view as it ignores the sampling uncertainty of the data. Given the constraints for a model, a proper likelihood analysis is conducted to develop a significance test for the validity of the instrumental model and a bootstrap algorithm for computing confidence intervals for the causal effect. Applications are presented to illustrate the methods and the advantage of a rigorous statistical approach. The use of covariates and intermediate variables for improving the efficiency of causal estimators is also discussed.
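For fully binary (A, B, C) and the joint distribution, the polytope computations are closely related to the classical linear-programming bounds of Balke and Pearl; the sketch below illustrates that baseline (the observed probabilities are toy values, and the thesis's pairwise-margin bounds are not reproduced):

```python
# Balke-Pearl-style LP bounds on the average causal effect of B on C with a
# binary instrument A. Response types encode all deterministic responses;
# the observed conditionals P(b, c | a) here are toy values.
import numpy as np
from scipy.optimize import linprog

tBs = [(0, 0), (0, 1), (1, 0), (1, 1)]  # (b when a=0, b when a=1)
tCs = [(0, 0), (0, 1), (1, 0), (1, 1)]  # (c when b=0, c when b=1)
types = [(tb, tc) for tb in tBs for tc in tCs]  # 16 joint response types

# Toy observed conditionals P(b, c | a):
P = {(a, b, c): 0.25 for a in (0, 1) for b in (0, 1) for c in (0, 1)}

A_eq, b_eq = [], []
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            row = [1.0 if (tb[a] == b and tc[b] == c) else 0.0
                   for tb, tc in types]
            A_eq.append(row)
            b_eq.append(P[(a, b, c)])

# ACE = P(C=1 | do(B=1)) - P(C=1 | do(B=0)), linear in the type probabilities:
ace = np.array([tc[1] - tc[0] for tb, tc in types], dtype=float)
lo = linprog(ace, A_eq=A_eq, b_eq=b_eq, bounds=[(0, 1)] * 16)
hi = linprog(-ace, A_eq=A_eq, b_eq=b_eq, bounds=[(0, 1)] * 16)
print(lo.fun, -hi.fun)  # lower and upper bounds on the causal effect
```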
8

Novelty detection with extreme value theory in vital-sign monitoring

Hugueny, Samuel Y. January 2013 (has links)
Every year in the UK, tens of thousands of hospital patients suffer adverse events, such as un-planned transfers to Intensive Therapy Units or unexpected cardiac arrests. Studies have shown that in a large majority of cases, significant physiological abnormalities can be observed within the 24-hour period preceding such events. Such warning signs may go unnoticed, if they occur between observations by the nursing staff, or are simply not identified as such. Timely detection of these warning signs and appropriate escalation schemes have been shown to improve both patient outcomes and the use of hospital resources, most notably by reducing patients’ length of stay. Automated real-time early-warning systems appear to be cost-efficient answers to the need for continuous vital-sign monitoring. Traditionally, a limitation of such systems has been their sensitivity to noisy and artefactual measurements, resulting in false-alert rates that made them unusable in practice, or earned them the mistrust of clinical staff. Tarassenko et al. (2005) and Hann (2008) proposed a novelty detection approach to the problem of continuous vital-sign monitoring, which, in a clinical trial, was shown to yield clinically acceptable false alert rates. In this approach, an observation is compared to a data fusion model, and its “normality” assessed by comparing a chosen statistic to a pre-set threshold. The method, while informed by large amounts of training data, has a number of heuristic aspects. This thesis proposes a principled approach to multivariate novelty detection in stochastic time- series, where novelty scores have a probabilistic interpretation, and are explicitly linked to the starting assumptions made. Our approach stems from the observation that novelty detection using complex multivariate, multimodal generative models is generally an ill-defined problem when attempted in the data space. In situations where “novel” is equivalent to “improbable with respect to a probability distribution ”, formulating the problem in a univariate probability space allows us to use classical results of univariate statistical theory. Specifically, we propose a multivariate extension to extreme value theory and, more generally, order statistics, suitable for performing novelty detection in time-series generated from a multivariate, possibly multimodal model. All the methods introduced in this thesis are applied to a vital-sign monitoring problem and compared to the existing method of choice. We show that it is possible to outperform the existing method while retaining a probabilistic interpretation. In addition to their application to novelty detection for vital-sign monitoring, contributions in this thesis to existing extreme value theory and order statistics are also valid in the broader context of data-modelling, and may be useful for analysing data from other complex systems.
