Global ETD Search

1	Non-parametric Clustering and Topic Modeling via Small Variance Asymptotics with Local Search Singh, Siddharth January 2013 (has links) No description available. Computer Science Clustering Topic Modeling Non-parametric clustering Local search Merge Parameter selection
2	Reflection Symmetry Detection in Images : Application to Photography Analysis / Détection de symétrie réflexion dans les images : application à l'analyse photographique Elsayed Elawady, Mohamed 29 March 2019 (has links) Symmetry is an important geometric property in visual perception that reflects how we perceive correspondences between the different objects or shapes present in a scene. It is used as a feature in many computer vision applications (for example object detection, segmentation, or recognition) and also as a formal characteristic in art studies (or aesthetic analysis). Significant progress has been made in recent decades in detecting symmetry in images, but many obstacles remain. In this thesis, we focus on detecting reflection symmetries in real-world images at the global scale. Our main contributions concern the feature extraction and global symmetry-axis representation stages. We first propose a new method for extracting edge segments using Log-Gabor filter banks, together with an inter-segment symmetry measure based on local shape, texture, and color features. This method won first place in the most recent international symmetry competition for single- and multi-axis detection. Our second contribution is a new method for representing symmetry axes in a linear-directional space. Symmetry properties are represented as a probability density that can be estimated non-parametrically with a kernel method based on the von Mises-Fisher distribution. We show that dominant axes can then be detected with a mean-shift-type algorithm paired with a suitable distance. We also introduce a new image dataset for single-axis symmetry detection in professional photographs, drawn from the large-scale AVA (Aesthetic Visual Analysis) dataset. Our contributions outperform state-of-the-art algorithms on all publicly available datasets, especially in the multi-axis case. We conclude that symmetry properties can be used as mid-level semantic visual features for analyzing and understanding photographs.
/ Symmetry is a fundamental principle of visual perception, conveying a sense of evenly distributed weight among the foreground objects in an image. It is used as a significant visual feature in various computer vision applications (e.g. object detection and segmentation), and as an important composition measure in the art domain (e.g. aesthetic analysis). Symmetry detection has advanced rapidly since the last century. In this thesis, we mainly aim to propose new approaches to detect reflection symmetry in real-world images at a global scale. In particular, our main contributions concern feature extraction and global representation of symmetry axes. First, we propose a novel approach that detects globally salient edges in an image using Log-Gabor filter banks, and defines a symmetry-oriented similarity measure from texture and color information around these edges. 
This method won a recent international symmetry competition in both the single-axis and multiple-axis cases. Second, we introduce a weighted kernel density estimator to represent linear and directional symmetry candidates in a continuous way, then propose a joint Gaussian-von Mises distance inside the mean-shift algorithm to select the relevant symmetry axis candidates along with their symmetry densities. In addition, we introduce a new challenging dataset of single symmetry axes in artistic photographs extracted from the large-scale Aesthetic Visual Analysis (AVA) dataset. The proposed contributions obtain superior results against state-of-the-art algorithms on all public datasets, especially in the multiple-axis case at a global scale. We conclude that the spatial and context information of each candidate axis in an image can be used as a local or global symmetry measure for further image analysis and scene understanding purposes. Symmetry Detection Reflection Symmetry Edge Segment Extraction Kernel Estimation Pairwise Similarity Symmetry Histogram Mean-Shift Algorithm Edge Features Non-parametric clustering
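The weighted kernel density idea in the abstract above can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function name `axis_density`, the bandwidth `sigma`, the concentration `kappa`, and the candidate format are all hypothetical. It assumes each candidate axis is given in normal form as (rho, theta, weight) with rho >= 0 and theta in [0, 2*pi), uses a Gaussian kernel on the linear offset and a von Mises kernel on the angle, and leaves the density unnormalized, which is enough to compare candidates.

```python
import math

def axis_density(rho, theta, candidates, sigma=5.0, kappa=8.0):
    """Unnormalized weighted kernel density at the axis (rho, theta).

    candidates: list of (rho_i, theta_i, weight_i) tuples (hypothetical format).
    sigma: Gaussian bandwidth for the linear offset rho (assumed value).
    kappa: von Mises concentration for the angle theta (assumed value).
    """
    total = 0.0
    for rho_i, theta_i, w_i in candidates:
        # Gaussian kernel on the linear coordinate.
        linear = math.exp(-0.5 * ((rho - rho_i) / sigma) ** 2)
        # von Mises kernel on the angle, scaled so its peak equals 1;
        # it is 2*pi-periodic in theta by construction.
        directional = math.exp(kappa * (math.cos(theta - theta_i) - 1.0))
        total += w_i * linear * directional
    return total

# Two nearby candidates reinforce each other; the third is isolated.
candidates = [(50.0, 1.00, 1.0), (52.0, 1.05, 0.5), (10.0, 3.00, 0.9)]
best = max(candidates, key=lambda c: axis_density(c[0], c[1], candidates))
```

Here `best` is the candidate sitting in the densest region of the (rho, theta) space. A mean-shift procedure would instead move each candidate iteratively toward a local maximum of this density; that step is omitted here.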
3	Non-Parametric Clustering of Multivariate Count Data Tekumalla, Lavanya Sita January 2017 (has links) (PDF) The focus of this thesis is models for non-parametric clustering of multivariate count data. While there has been significant work on Bayesian non-parametric modelling in the last decade, in the context of mixture models for real-valued data and some forms of discrete data such as multinomial mixtures, there has been much less work on non-parametric clustering of multivariate count data. The main challenges in clustering multivariate counts include choosing a suitable multivariate distribution that adequately captures the properties of the data, for instance handling over-dispersed data or sparse multivariate data, while at the same time leveraging the inherent dependency structure between dimensions and across instances to get meaningful clusters. As the first contribution, this thesis explores extensions to the Multivariate Poisson distribution, proposing efficient algorithms for non-parametric clustering of multivariate count data. While Poisson is the most popular distribution for count modelling, the Multivariate Poisson often leads to intractable inference and a suboptimal fit of the data. To address this, we introduce a family of models based on the Sparse Multivariate Poisson that exploit the inherent sparsity in multivariate data, reducing the number of latent variables in the formulation of the Multivariate Poisson and leading to a better fit and more efficient inference. We explore Dirichlet process mixture model extensions and temporal non-parametric extensions to models based on the Sparse Multivariate Poisson for practical use of Poisson-based models for non-parametric clustering of multivariate counts in real-world applications. As a second contribution, this thesis addresses moving beyond the limitations of Poisson-based models for non-parametric clustering, for instance in handling over-dispersed data or data with negative correlations. 
We explore, for the first time, marginal independent inference techniques based on the Gaussian Copula for multivariate count data in the Dirichlet Process mixture model setting. This enables non-parametric clustering of multivariate counts without limiting assumptions that usually restrict the marginals to belong to a particular family, such as the Poisson or the negative binomial. This inference technique can also work for mixed data (a combination of counts, binary and continuous data), enabling Bayesian non-parametric modelling to be used for a wide variety of data types. As the third contribution, this thesis addresses modelling a wide range of more complex dependencies, such as asymmetric and tail dependencies, during non-parametric clustering of multivariate count data with Vine Copula based Dirichlet process mixtures. While vine copula inference has been well explored for continuous data, it is still a topic of active research for multivariate counts and mixed multivariate data. Inference for multivariate counts and mixed data is a hard problem owing to ties that arise with discrete marginals. An efficient marginal independent inference approach based on the extended rank likelihood, building on recent work in the statistics literature, is proposed in this thesis, extending the use of vines to multivariate counts and mixed data in practical clustering scenarios. This thesis also explores the novel systems application of Bulk Cache Preloading by analysing I/O traces through predictive models for temporal non-parametric clustering of multivariate count data. State-of-the-art techniques in the caching domain are limited to exploiting short-range correlations in memory accesses at millisecond granularity or smaller and cannot leverage long-range correlations in traces. 
We explore, for the first time, Bulk Cache Preloading, the process of proactively predicting data to load into cache, minutes or hours before the actual request from the application, by leveraging longer-range correlations at the granularity of minutes or hours. This enables the development of machine learning techniques tailored for caching due to relaxed timing constraints. Our approach involves a data aggregation process, converting I/O traces into a temporal sequence of multivariate counts, that we analyse with the temporal non-parametric clustering models proposed in this thesis. While the focus of our thesis is models for non-parametric clustering for discrete data, particularly multivariate counts, we also hope our work on bulk cache preloading paves the way to more inter-disciplinary research on using data mining techniques in the systems domain. As an additional contribution, this thesis addresses multi-level non-parametric admixture modelling for discrete data in the form of grouped categorical data, such as document collections. Non-parametric clustering for topic modelling in document collections, where a document is associated with an unknown number of semantic themes or topics, is well explored with admixture models such as the Hierarchical Dirichlet Process. However, there exist scenarios where a document needs to be associated with themes at multiple levels, where each theme is itself an admixture over themes at the previous level, motivating the need for multilevel admixtures. Consider the example of non-parametric entity-topic modelling, which simultaneously learns entities and topics from document collections. This can be realized by modelling a document as an admixture over entities, while entities could themselves be modelled as admixtures over topics. We propose the nested Hierarchical Dirichlet Process to address this gap and apply a two-level version of our model to automatically learn author entities and topics from research corpora. 
Multivariate Count Data Clustering Mixture Models Non-parametric Clustering Bulk Cache Preloading Dirichlet Process Mixture Models Spatio-Temporal Data Aggregation Sparse Multivariate Poisson MultiVariate Poisson (MVP) Copulas Nested Hierarchical Dirichlet Processes Dirichlet Process Mixtures Sparse-Multivariate Poisson Dirichlet Process Mixture Model Computer Science
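The data aggregation step mentioned in the abstract above, converting I/O traces into a temporal sequence of multivariate counts, might look roughly like the following Python sketch. The trace format (timestamp in seconds, block address), the function name `aggregate_trace`, and the parameters `window_s`, `region_blocks`, and `n_regions` are all assumptions made for illustration; the abstract does not specify how the thesis performs this aggregation.

```python
from collections import Counter

def aggregate_trace(trace, window_s=60.0, region_blocks=1024, n_regions=4):
    """Turn (timestamp_s, block_address) records into per-window count vectors.

    Each time window of length window_s yields one vector whose r-th entry
    counts the accesses that fell into address region r during that window.
    """
    if not trace:
        return []
    start = min(t for t, _ in trace)
    counts = Counter()
    last_window = 0
    for t, block in trace:
        window = int((t - start) // window_s)
        # Addresses beyond the last region are clamped into it.
        region = min(block // region_blocks, n_regions - 1)
        counts[(window, region)] += 1
        last_window = max(last_window, window)
    # Windows with no accesses appear as all-zero vectors.
    return [[counts[(w, r)] for r in range(n_regions)]
            for w in range(last_window + 1)]

# Hypothetical trace: five accesses spread over a little more than two minutes.
trace = [(0.0, 10), (5.0, 20), (30.0, 1500), (61.0, 10), (130.0, 5000)]
sequence = aggregate_trace(trace)
```

The resulting `sequence` is a list of count vectors, one per time window, which is the kind of input a temporal clustering model for multivariate counts would consume.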
