21

Defining and predicting fast-selling clothing options

Jesperson, Sara January 2019
This thesis aims to find a definition of fast-selling clothing options and to find a way to predict them using only a few weeks of sales data as input. The data used for this project contain daily sales and intake quantities for seasonal options, with sales starting in 2016-2018, provided by the department store chain Åhléns. A definition is found that describes fast-selling clothing options as those having sold a certain percentage of their intake after a fixed number of days. An alternative definition based on cluster affiliation is proven less effective. Two predictive models are tested, the first being a probabilistic classifier and the second a k-nearest neighbor classifier using the Euclidean distance. The probabilistic model is divided into three steps: transformation, clustering, and classification. The time series are transformed with B-splines to reduce dimensionality, where each time series is represented by a vector containing its length and B-spline coefficients. As a tool to improve the quality of the predictions, the B-spline vectors are clustered with a Gaussian mixture model where every cluster is assigned one of the two labels fast-selling or ordinary, thus dividing the clusters into disjoint sets: one containing fast-selling clusters and the other containing ordinary clusters. Lastly, the time series to be predicted are assumed to be Laplace distributed around a B-spline, and using the probability distributions provided by the clustering, the posterior probability for each class is used to classify the new observations. In the transformation step, the number of knots for the B-splines is evaluated with cross-validation, and the Gaussian mixture models from the clustering step are evaluated with the Bayesian information criterion, BIC. The predictive performance of both classifiers is evaluated with accuracy, precision, and recall. The probabilistic model outperforms the k-nearest neighbor model with considerably higher values of accuracy, precision, and recall. The performance of each model is improved by using more data to make the predictions, most prominently for the probabilistic model.
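The pipeline described in this abstract is not tied to a public implementation; the sketch below is a minimal, simplified Python illustration of the probabilistic model's three steps, with all data, thresholds, and parameter choices invented for the example. Time series are reduced to B-spline coefficients, the coefficient vectors are clustered with a Gaussian mixture model, clusters are labelled fast-selling or ordinary, and a new option is scored by summing the posterior probabilities of the fast-selling clusters (the thesis instead places a Laplace observation model around the B-spline).

```python
import numpy as np
from scipy.interpolate import splrep
from sklearn.mixture import GaussianMixture

# Hypothetical daily cumulative sell-through (fraction of intake sold), one row per clothing option.
rng = np.random.default_rng(0)
rates = rng.uniform(0.5, 2.0, size=(200, 1))
series = np.clip(np.cumsum(rates * rng.uniform(0, 0.02, size=(200, 56)), axis=1), 0, 1)

# Assumed definition for the example: fast-selling = at least 50% of intake sold within six weeks.
fast = series[:, 41] >= 0.5

def bspline_features(y, n_knots=4, degree=3):
    """Represent a time series by its B-spline coefficients (dimensionality reduction)."""
    x = np.linspace(0, 1, len(y))
    knots = np.linspace(0, 1, n_knots + 2)[1:-1]        # interior knots
    t, c, k = splrep(x, y, k=degree, t=knots)           # least-squares spline fit
    return c[: n_knots + degree + 1]                    # the fitted coefficients

X = np.array([bspline_features(y) for y in series])

# Cluster the coefficient vectors; label a cluster fast-selling if most of its members are.
gmm = GaussianMixture(n_components=6, random_state=0).fit(X)
assign = gmm.predict(X)
cluster_is_fast = np.array([
    fast[assign == j].mean() > 0.5 if np.any(assign == j) else False
    for j in range(gmm.n_components)
])

# Classify a new option via its posterior cluster responsibilities.
posterior = gmm.predict_proba([bspline_features(series[0])])[0]
print("P(fast-selling) =", posterior[cluster_is_fast].sum())
```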
22

An incremental Gaussian mixture network for data stream classification in non-stationary environments

Diaz, Jorge Cristhian Chamby January 2018
Data stream classification poses many challenges for the data mining community when the environment is non-stationary. The greatest challenge in learning classifiers from data streams relates to adapting to concept drifts, which occur as a result of changes in the underlying concepts. Two main ways to develop adaptive approaches are ensemble methods and incremental algorithms. Ensemble methods play an important role due to their modularity, which provides a natural way of adapting to change. Incremental algorithms are faster and have better anti-noise capacity than ensemble algorithms, but have more restrictions on concept-drifting data streams. Thus, it is a challenge to combine the flexibility and adaptation of an ensemble classifier in the presence of concept drift with the simplicity of use found in a single classifier with incremental learning. With this motivation, in this dissertation we propose an incremental, online and probabilistic algorithm for classification in an effort to tackle concept drift. The algorithm is called IGMN-NSE and is an adaptation of the IGMN algorithm. The two main contributions of IGMN-NSE in relation to the IGMN are improved predictive power for classification tasks and adaptation to achieve good performance in non-stationary environments. Extensive studies on both synthetic and real-world data demonstrate that the proposed algorithm can track changing environments very closely, regardless of the type of concept drift.
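IGMN-NSE itself is not available in standard libraries. Purely as a hedged illustration of the single-pass, incremental setting it addresses (not of the algorithm), the sketch below trains scikit-learn's GaussianNB with partial_fit on a synthetic stream containing an abrupt concept drift. Because such a plain incremental learner keeps all of its accumulated statistics, its accuracy collapses after the drift and recovers slowly, if at all, which is exactly the gap that drift-aware methods like IGMN-NSE target. All data and thresholds are invented.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)

def batch(t, n=50):
    """Synthetic two-class stream; the class regions swap after t=100 (abrupt concept drift)."""
    y = rng.integers(0, 2, n)
    centers = np.array([[0.0, 0.0], [2.0, 2.0]])
    if t >= 100:
        centers = centers[::-1]                      # drift: labels swap regions
    X = centers[y] + rng.normal(scale=0.5, size=(n, 2))
    return X, y

clf = GaussianNB()
for t in range(200):
    X, y = batch(t)
    if t > 0:                                        # prequential evaluation: test, then train
        acc = clf.score(X, y)
        if t % 40 == 0 or t in (99, 100, 101):
            print(f"t={t:3d} accuracy={acc:.2f}")
    clf.partial_fit(X, y, classes=[0, 1])            # single pass over each batch
```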
23

Continuous reinforcement learning with incremental Gaussian mixture models

Pinto, Rafael Coimbra January 2017
This thesis' original contribution is a novel algorithm that integrates a data-efficient function approximator with reinforcement learning in continuous state spaces. The complete research includes the development of a scalable online and incremental algorithm capable of learning from a single pass through the data. This algorithm, called Fast Incremental Gaussian Mixture Network (FIGMN), was employed as a sample-efficient function approximator for the state space of continuous reinforcement learning tasks, which, combined with linear Q-learning, results in competitive performance. Then, this same function approximator was employed to model the joint space of states and Q-values, all in a single FIGMN, resulting in a concise and data-efficient algorithm, i.e., a reinforcement learning algorithm that learns from very few interactions with the environment. A single episode is enough to learn the investigated tasks in most trials. Results are analysed in order to explain the properties of the obtained algorithm, and it is observed that the use of the FIGMN function approximator brings some important advantages to reinforcement learning in relation to conventional neural networks.
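FIGMN is incremental and has no mainstream library implementation; as a rough sketch of the first setup described above (Gaussian mixture features feeding a linear Q-learner), the code below uses a batch-fitted scikit-learn mixture as the state representation for a toy, invented control task. The environment, hyperparameters, and feature construction are illustrative only, not the thesis' method.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Toy continuous-state task (assumed for illustration): move a point toward the origin on [-1, 1].
def step(s, a):                                        # a in {0: left, 1: right}
    s2 = np.clip(s + (0.1 if a == 1 else -0.1), -1.0, 1.0)
    done = abs(s2) < 0.05
    return s2, (1.0 if done else -0.01), done

# State features: responsibilities under a Gaussian mixture fitted to sampled states
# (a fixed, batch-fitted GMM here, standing in for the incremental FIGMN of the thesis).
warmup = np.random.uniform(-1, 1, size=(500, 1))
gmm = GaussianMixture(n_components=8, random_state=0).fit(warmup)
phi = lambda s: gmm.predict_proba([[s]])[0]

n_actions, alpha, gamma, eps = 2, 0.1, 0.99, 0.1
W = np.zeros((n_actions, gmm.n_components))            # linear Q-weights per action

for episode in range(100):
    s = np.random.uniform(-1, 1)
    for t in range(200):                                # cap episode length
        f = phi(s)
        a = np.random.randint(n_actions) if np.random.rand() < eps else int(np.argmax(W @ f))
        s2, r, done = step(s, a)
        target = r + (0.0 if done else gamma * np.max(W @ phi(s2)))
        W[a] += alpha * (target - W[a] @ f) * f         # TD(0) update on the linear Q-function
        s = s2
        if done:
            break
```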
24

It Is Better to Be Upside Than Sharpe!

D'Apuzzo, Daniele 01 April 2017
Based on the assumption that returns in Commercial Real Estate (CRE) are normally distributed, the Sharpe Ratio has been the standard risk-adjusted performance measure for the past several years. Research has questioned whether this assumption can reasonably be made. The Upside Potential Ratio (UPR) is an alternative risk-adjusted performance measure; its values differ from the Sharpe Ratio's only under the assumption of skewed returns. We provide reasonable evidence that CRE returns should not be fitted with a normal distribution and present the Gaussian Mixture Model as our choice of distribution for capturing skewness. We then use a GMM distribution to measure the performance of domestic CRE markets via the UPR. Additional insights are presented by introducing an alternative risk-adjusted performance measure that we call the D-ratio. We show how the UPR and the D-ratio can provide a tool-box that can be added to any existing investment strategy when assessing markets' past performance and timing of entry. The intent of this thesis is not to provide a comprehensive framework for CRE investment decisions but to introduce statistical and mathematical tools that can serve any portfolio manager in augmenting an investment strategy already in place.
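As a hedged sketch of the kind of computation involved (not the thesis' actual data or calibration), the snippet below fits a two-component Gaussian mixture to synthetic, skewed return data and estimates the Upside Potential Ratio by sampling from the fitted model. The minimum acceptable return (MAR) and all numbers are invented.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical periodic CRE total returns for one market (replace with real data); made skewed on purpose.
rng = np.random.default_rng(1)
returns = np.concatenate([rng.normal(0.02, 0.01, 80), rng.normal(-0.04, 0.03, 20)])

gmm = GaussianMixture(n_components=2, random_state=0).fit(returns.reshape(-1, 1))

def upside_potential_ratio(model, mar=0.0, n=200_000):
    """UPR = E[(R - MAR)+] / sqrt(E[((MAR - R)+)^2]), estimated by sampling the fitted mixture."""
    r = model.sample(n)[0].ravel()
    upside = np.maximum(r - mar, 0.0).mean()
    downside_dev = np.sqrt(np.mean(np.maximum(mar - r, 0.0) ** 2))
    return upside / downside_dev

print("UPR:", upside_potential_ratio(gmm, mar=0.005))
```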
25

Foreground Segmentation of Moving Objects

Molin, Joel January 2010
Foreground segmentation is a common first step in tracking and surveillance applications. The purpose of foreground segmentation is to provide later stages of image processing with an indication of where interesting data can be found. This thesis is an investigation of how foreground segmentation can be performed in two contexts: as a pre-step to trajectory tracking and as a pre-step in indoor surveillance applications.

Three methods are selected and detailed: a single Gaussian method, a Gaussian mixture model method, and a codebook method. Experiments are then performed on typical input video using these methods. It is concluded that the Gaussian mixture model produces the output that yields the best trajectories when used as input to the trajectory tracker. An extension to the Gaussian mixture model is proposed which reduces shadows, improving the performance of foreground segmentation in the surveillance context.
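The thesis evaluates its own implementations; purely as an illustration of the Gaussian mixture approach it favours, the sketch below uses OpenCV's MOG2 background subtractor, which implements per-pixel GMM background modelling and can optionally mark shadow pixels. The file name and parameter values are placeholders.

```python
import cv2

cap = cv2.VideoCapture("input.avi")                    # hypothetical input video
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16, detectShadows=True)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)                     # 255 = foreground, 127 = shadow, 0 = background
    fg_only = (mask == 255).astype("uint8") * 255      # drop shadow pixels from the mask
    foreground = cv2.bitwise_and(frame, frame, mask=fg_only)
    cv2.imshow("foreground", foreground)
    if (cv2.waitKey(30) & 0xFF) == 27:                 # Esc to quit
        break

cap.release()
cv2.destroyAllWindows()
```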
26

Statistical Background Models with Shadow Detection for Video Based Tracking

Wood, John January 2007
A common problem when using background models to segment moving objects from video sequences is that the shadows cast by objects usually differ significantly from the background and therefore get detected as foreground. This causes several problems when extracting and labeling objects, such as object shape distortion and several objects merging together. The purpose of this thesis is to explore various possibilities to handle this problem.

Three methods for statistical background modeling are reviewed. All methods work on a per-pixel basis: the first is based on approximating the median, the second on Gaussian mixture models, and the last on channel representation. It is concluded that all methods detect cast shadows as foreground.

A study of existing methods to handle cast shadows has been carried out in order to gain knowledge on the subject and get ideas. A common approach is to transform the RGB color representation into a representation that separates intensity from chromatic components in order to determine whether or not newly sampled pixel values are related to the background. The color spaces HSV, IHSL, CIELAB, YCbCr, and a color model proposed in the literature (Horprasert et al.) are discussed and compared for the purpose of shadow detection. It is concluded that Horprasert's color model is the most suitable for this purpose.

The thesis ends with a proposal of a method that combines background modeling using Gaussian mixture models with shadow detection using Horprasert's color model. It is concluded that, while not perfect, such a combination can be very helpful in segmenting objects and detecting their cast shadows.
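A minimal sketch of the brightness/chromaticity decomposition behind Horprasert-style shadow detection is given below. It is a simplification (the original model normalises the distortions by per-pixel statistics learned from the background), and all thresholds are illustrative. In a combined system, pixels flagged as foreground by the Gaussian mixture background model but passing this test would be relabelled as shadow rather than object.

```python
import numpy as np

def shadow_mask(frame, background, alpha_lo=0.4, alpha_hi=0.95, cd_max=10.0):
    """Simplified Horprasert-style test: a pixel is shadow if it looks like a darkened version of
    the background (brightness distortion alpha < 1) with small chromaticity distortion CD.
    Both images are HxWx3 arrays; thresholds here are illustrative only."""
    I = frame.astype(np.float64)
    E = background.astype(np.float64)
    # Brightness distortion: the scalar alpha minimizing ||I - alpha * E||^2 at each pixel.
    alpha = (I * E).sum(axis=2) / np.maximum((E * E).sum(axis=2), 1e-6)
    # Chromaticity distortion: residual distance to the scaled background color.
    cd = np.linalg.norm(I - alpha[..., None] * E, axis=2)
    return (alpha > alpha_lo) & (alpha < alpha_hi) & (cd < cd_max)
```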
27

Mixture model cluster analysis under different covariance structures using information complexity

Erar, Bahar 01 August 2011
In this thesis, a mixture-model cluster analysis technique under different covariance structures of the component densities is developed and presented. The technique captures the compactness, orientation, shape, and volume of the component clusters in one expert system for handling high-dimensional, heterogeneous Gaussian data sets, adding flexibility to currently practiced cluster analysis techniques. Two approaches to parameter estimation are considered and compared: one using the Expectation-Maximization (EM) algorithm and another following a Bayesian framework using the Gibbs sampler. We develop and score several forms of the ICOMP criterion of Bozdogan (1994, 2004) as our fitness function to choose the number of component clusters, to choose the correct component covariance matrix structure among nine candidate covariance structures, and to select the optimal parameters and the best-fitting mixture model. We demonstrate our approach on simulated datasets and on a real, large data set, focusing on early detection of breast cancer. We show that our approach improves on the probability of classification error achieved by existing methods.
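ICOMP is not implemented in common libraries, so the sketch below substitutes BIC as the information criterion. It scores scikit-learn Gaussian mixtures over the four covariance structures that library exposes (the thesis considers nine candidates) and a range of cluster counts, using the Wisconsin breast cancer data bundled with scikit-learn as a stand-in data set.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

X = StandardScaler().fit_transform(load_breast_cancer().data)

best = None
for cov in ("full", "tied", "diag", "spherical"):      # candidate covariance structures
    for k in range(1, 6):                              # candidate numbers of component clusters
        gmm = GaussianMixture(n_components=k, covariance_type=cov, random_state=0).fit(X)
        score = gmm.bic(X)                             # lower is better
        if best is None or score < best[0]:
            best = (score, cov, k)

print("best BIC=%.1f, covariance=%s, k=%d" % best)
```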
28

Model Based Speech Enhancement and Coding

Zhao, David Yuheng January 2007
In mobile speech communication, adverse conditions, such as noisy acoustic environments and unreliable network connections, may severely degrade the intelligibility and naturalness of the received speech quality, and increase the listening effort. This thesis focuses on countermeasures based on statistical signal processing techniques. The main body of the thesis consists of three research articles, targeting two specific problems: speech enhancement for noise reduction and flexible source coder design for unreliable networks.

Papers A and B consider speech enhancement for noise reduction. New schemes based on an extension to the auto-regressive (AR) hidden Markov model (HMM) for speech and noise are proposed. Stochastic models for speech and noise gains (excitation variance from an AR model) are integrated into the HMM framework in order to improve the modeling of energy variation. The extended model is referred to as a stochastic-gain hidden Markov model (SG-HMM). The speech gain describes the energy variations of the speech phones, typically due to differences in pronunciation and/or different vocalizations of individual speakers. The noise gain improves the tracking of the time-varying energy of non-stationary noise, e.g., due to movement of the noise source. In Paper A, it is assumed that prior knowledge on the noise environment is available, so that a pre-trained noise model is used. In Paper B, the noise model is adaptive and the model parameters are estimated on-line from the noisy observations using a recursive estimation algorithm. Based on the speech and noise models, a novel Bayesian estimator of the clean speech is developed in Paper A, and an estimator of the noise power spectral density (PSD) in Paper B. It is demonstrated that the proposed schemes achieve more accurate models of speech and noise than traditional techniques, and as part of a speech enhancement system provide improved speech quality, particularly for non-stationary noise sources.

In Paper C, a flexible entropy-constrained vector quantization scheme based on Gaussian mixture models (GMM), lattice quantization, and arithmetic coding is proposed. The method allows for changing the average rate in real-time, and facilitates adaptation to the currently available bandwidth of the network. A practical solution to the classical issue of indexing and entropy-coding the quantized code vectors is given. The proposed scheme has a computational complexity that is independent of rate, and quadratic with respect to vector dimension. Hence, the scheme can be applied to the quantization of source vectors in a high-dimensional space. The theoretical performance of the scheme is analyzed under a high-rate assumption. It is shown that, at high rate, the scheme approaches the theoretically optimal performance, if the mixture components are located far apart. The practical performance of the scheme is confirmed through simulations on both synthetic and speech-derived source vectors.
29

Road Extraction from Satellite Images by Self-Supervised Classification and Perceptual Grouping

Sahin, Eda 01 January 2013
Road network extraction from high-resolution satellite imagery is the most frequently utilized technique for updating and correcting geographic information system (GIS) databases, registering multi-temporal images for change detection, and automatically aligning spatial datasets. This method is widely employed owing to improvements in satellite technology, such as the development of new sensors for high-resolution imagery. To avoid the cost of human interaction, various automatic and semi-automatic road extraction methods have been developed and proposed in the literature. The aim of this study is to develop a fully automated method that can extract road networks by using the spectral and structural features of the roads. In order to achieve this goal we set various objectives and work through them one by one. The first objective is to obtain reliable road seeds, since they are crucial for determining road regions correctly in the classification step. The second objective is to find the most convenient features and classification method for road extraction. The third objective is to locate the road centerlines, which define the road topology. A number of algorithms are developed and tested throughout the thesis to achieve these objectives, and the advantages of the proposed ones are explained. The final version of the proposed algorithm is tested on three-band (RGB) satellite images and the results are compared with other studies in the literature to illustrate the benefits of the proposed algorithm.
30

Information Theoretic Criteria for Image Quality Assessment Based on Natural Scene Statistics

Zhang, Di January 2006
Measurement of visual quality is crucial for various image and video processing applications.

The goal of objective image quality assessment is to introduce a computational quality metric that can predict image or video quality. Many methods have been proposed in the past decades. Traditionally, measurements convert the spatial data into some other feature domain, such as the Fourier domain, and detect the similarity, such as mean square distance or Minkowski distance, between the test data and the reference or perfect data; however, only limited success has been achieved. None of the complicated metrics show any great advantage over other existing metrics.

The common idea shared among many proposed objective quality metrics is that human visual error sensitivities vary in different spatial and temporal frequency and directional channels. In this thesis, image quality assessment is approached by proposing a novel framework to compute the information lost in each channel, rather than the similarities used in previous methods. Based on natural scene statistics and several image models, an information theoretic framework is designed to compute the perceptual information contained in images and evaluate image quality in the form of entropy.

The thesis is organized as follows. Chapter I gives a general introduction to previous work in this research area and a brief description of the human visual system. In Chapter II, statistical models for natural scenes are reviewed. Chapter III presents the core ideas behind the computation of the perceptual information contained in images. In Chapter IV, information theoretic criteria for image quality assessment are defined. Chapter V presents the simulation results in detail. In the last chapter, future directions and improvements of this research are discussed.
