Global ETD Search

1	Improving Image Classification Performance using Joint Feature Selection Maboudi Afkham, Heydar January 2014 In this thesis, we focus on the problem of image classification and investigate how its performance can be systematically improved. Improving the performance of different computer vision methods has been the subject of many studies. While different studies take different approaches to achieve this improvement, in this thesis we address this problem by investigating the relevance of the statistics collected from the image. We propose a framework for gradually improving the quality of an already existing image descriptor. In our studies, we employ a descriptor which is composed of the responses of a series of discriminative components for summarizing each image. As we will show, this descriptor has an ideal form in which all categories become linearly separable. While reaching this form is not possible, we will argue that, by replacing a small fraction of these components, it is possible to obtain a descriptor which is, on average, closer to this ideal form. To do so, we initially identify which components do not contribute to the quality of the descriptor and replace them with more robust components. As we will show, this replacement has a positive effect on the quality of the descriptor. While there are many ways of obtaining more robust components, we introduce a joint feature selection problem to obtain image features that retain class-discriminative properties while simultaneously generalising across within-class variations. Our approach is based on the concept of a joint feature where several small features are combined in a spatial structure. The proposed framework automatically learns the structure of the joint constellations in a class-dependent manner, improving the generalisation and discrimination capabilities of the local descriptor while still retaining a low-dimensional representation.
The joint feature selection problem discussed in this thesis belongs to a specific class of latent variable models that assumes each labeled sample is associated with a set of different features, with no prior knowledge of which feature is the most relevant feature to be used. Deformable-Part Models (DPM) can be seen as good examples of such models. These models are usually considered to be expensive to train and very sensitive to the initialization. Here, we focus on the learning of such models by introducing a topological framework and show how it is possible to both reduce the learning complexity and produce more robust decision boundaries. We will also argue how our framework can be used for producing robust decision boundaries without exploiting the dataset bias or relying on accurate annotations. To examine the hypothesis of this thesis, we evaluate different parts of our framework on several challenging datasets and demonstrate how our framework is capable of gradually improving the performance of image classification by collecting more robust statistics from the image and improving the quality of the descriptor. Image Classification Latent Variable Models
2	Exploring Weakly Labeled Data Across the Noise-Bias Spectrum Fisher, Robert W. H. 01 April 2016 As the availability of unstructured data on the web continues to increase, it is becoming increasingly necessary to develop machine learning methods that rely less on human annotated training data. In this thesis, we present methods for learning from weakly labeled data. We present a unifying framework to understand weakly labeled data in terms of bias and noise and identify methods that are well suited to learning from certain types of weak labels. To compensate for the tremendous sizes of weakly labeled datasets, we leverage computationally efficient and statistically consistent spectral methods. Using these methods, we present results from four diverse, real-world applications coupled with a unifying simulation environment. This allows us to make general observations that would not be apparent when examining any one application on its own. These contributions allow us to significantly improve prediction when labeled data is available, and they also make learning tractable when the cost of acquiring annotated data is prohibitively high. Weakly labeled data spectral methods latent variable models
3	Generative Models for Video Analysis and 3D Range Data Applications Orriols Majoral, Xavier 27 February 2004 The majority of problems in Computer Vision do not contain a direct relation between the stimulus provided by a general-purpose sensor and its corresponding perceptual category. A complex learning task must be involved in order to provide such a connection. In fact, the basic forms of energy, and their possible combinations, are few in number compared to the infinite possible perceptual categories corresponding to objects, actions, relations among objects, etc. Two main factors determine the level of difficulty of a specific problem: i) the different levels of information that are employed, and ii) the complexity of the model that is intended to explain the observations. The choice of an appropriate representation for the data becomes particularly relevant when dealing with invariances, since these usually imply that the number of intrinsic degrees of freedom in the data distribution is lower than the number of coordinates used to represent it. Therefore, the decomposition into basic units (model parameters) and the change of representation allow a complex problem to be transformed into a manageable one. This simplification of the estimation problem has to rely on a proper mechanism of combination of those primitives in order to give an optimal description of the global complex model. This thesis shows how Latent Variable Models reduce dimensionality while taking into account the internal symmetries of a problem, provide a way of dealing with missing data, and make it possible to predict new observations. The lines of research of this thesis are directed to the management of multiple data sources.
More specifically, this thesis presents a set of new algorithms applied to two different areas in Computer Vision: i) video analysis and summarization, and ii) 3D range data. Both areas have been approached through the Generative Models framework, where similar protocols for representing data have been employed. Computer vision Multimedia Latent Variable Models Tecnologies 68
4	Optimal Bayesian estimators for latent variable cluster models Rastelli, Riccardo, Friel, Nial 11 1900 In cluster analysis interest lies in probabilistically capturing partitions of individuals, items or observations into groups, such that those belonging to the same group share similar attributes or relational profiles. Bayesian posterior samples for the latent allocation variables can be effectively obtained in a wide range of clustering models, including finite mixtures, infinite mixtures, hidden Markov models and block models for networks. However, due to the categorical nature of the clustering variables and the lack of scalable algorithms, summary tools that can interpret such samples are not available. We adopt a Bayesian decision theoretical approach to define an optimality criterion for clusterings and propose a fast and context-independent greedy algorithm to find the best allocations. One important facet of our approach is that the optimal number of groups is automatically selected, thereby solving the clustering and the model-choice problems at the same time. We consider several loss functions to compare partitions and show that our approach can accommodate a wide range of cases. Finally, we illustrate our approach on both artificial and real datasets for three different clustering models: Gaussian mixtures, stochastic block models and latent block models for networks.
5	Latent feature networks for statistical relational learning Khoshneshin, Mohammad 01 July 2012 In this dissertation, I explored relational learning via latent variable models. Traditional machine learning algorithms cannot handle many learning problems where there is a need for modeling both relations and noise. Statistical relational learning approaches emerged to handle these applications by incorporating both relations and uncertainties in these problems. Latent variable models are one of the successful approaches for statistical relational learning. These models assume a latent variable for each entity and then the probability distribution over relationships between entities is modeled via a function over latent variables. One important example of relational learning via latent variables is text data modeling. In text data modeling, we are interested in modeling the relationship between words and documents. Latent variable models learn this data by assuming a latent variable for each word and document. The co-occurrence value is defined as a function of these random variables. For modeling co-occurrence data in general (and text data in particular), we proposed latent logistic allocation (LLA). LLA outperforms the state-of-the-art model, latent Dirichlet allocation, in text data modeling, document categorization and information retrieval. We also proposed query-based visualization which embeds documents relevant to a query in a 2-dimensional space. Additionally, I used latent variable models for other single-relational problems such as collaborative filtering and educational data mining. To move towards multi-relational learning via latent variable models, we propose latent feature networks (LFN). Multi-relational learning approaches model multiple relationships simultaneously. LFN assumes a component for each relationship.
Each component is a latent variable model where a latent variable is defined for each entity and the relationship is a function of latent variables. However, if an entity participates in more than one relationship, then it will have a separate random variable for each relationship. We used LFN for modeling two different problems: microarray classification and social network analysis with a side network. In the first application, LFN outperforms support vector machines, the best propositional model for that application. In the second application, using the side information via LFN can drastically improve the link prediction task in a social network. latent variable models multi-relational learning statistical machine learning statistical relational learning
6	Multivariate ordinal regression models: an analysis of corporate credit ratings Hirk, Rainer, Hornik, Kurt, Vana, Laura January 2018 Correlated ordinal data typically arises from multiple measurements on a collection of subjects. Motivated by an application in credit risk, where multiple credit rating agencies assess the creditworthiness of a firm on an ordinal scale, we consider multivariate ordinal regression models with a latent variable specification and correlated error terms. Two different link functions are employed, by assuming a multivariate normal and a multivariate logistic distribution for the latent variables underlying the ordinal outcomes. Composite likelihood methods, more specifically the pairwise and tripletwise likelihood approach, are applied for estimating the model parameters. Using simulated data sets with varying number of subjects, we investigate the performance of the pairwise likelihood estimates and find them to be robust for both link functions and reasonable sample size. The empirical application consists of an analysis of corporate credit ratings from the big three credit rating agencies (Standard & Poor's, Moody's and Fitch). Firm-level and stock price data for publicly traded US firms as well as an unbalanced panel of issuer credit ratings are collected and analyzed to illustrate the proposed framework.
7	A dynamic network model to measure exposure diversification in the Austrian interbank market Hledik, Juraj, Rastelli, Riccardo 08 August 2018 We propose a statistical model for weighted temporal networks capable of measuring the level of heterogeneity in a financial system. Our model focuses on the level of diversification of financial institutions; that is, whether they are more inclined to distribute their assets equally among partners, or whether they instead concentrate their commitment on a limited number of institutions. Crucially, a Markov property is introduced to capture time dependencies and to make our measures comparable across time. We apply the model to an original dataset of Austrian interbank exposures. The temporal span encompasses the onset and development of the financial crisis in 2008 as well as the beginnings of the European sovereign debt crisis in 2011. Our analysis highlights an overall increasing trend for network homogeneity, whereby core banks have a tendency to distribute their market exposures more equally across their partners.
8	Multivariate Ordinal Regression Models: An Analysis of Corporate Credit Ratings Hirk, Rainer, Hornik, Kurt, Vana, Laura 01 1900 Correlated ordinal data typically arise from multiple measurements on a collection of subjects. Motivated by an application in credit risk, where multiple credit rating agencies assess the creditworthiness of a firm on an ordinal scale, we consider multivariate ordinal models with a latent variable specification and correlated error terms. Two different link functions are employed, by assuming a multivariate normal and a multivariate logistic distribution for the latent variables underlying the ordinal outcomes. Composite likelihood methods, more specifically the pairwise and tripletwise likelihood approach, are applied for estimating the model parameters. We investigate how sensitive the pairwise likelihood estimates are to the number of subjects and to the presence of observations missing completely at random, and find that these estimates are robust for both link functions and reasonable sample size. The empirical application consists of an analysis of corporate credit ratings from the big three credit rating agencies (Standard & Poor's, Moody's and Fitch). Firm-level and stock price data for publicly traded US companies as well as an incomplete panel of issuer credit ratings are collected and analyzed to illustrate the proposed framework. / Series: Research Report Series / Department of Statistics and Mathematics
9	Model-based understanding of facial expressions Sauer, Patrick Martin January 2013 In this thesis we present novel methods for constructing and fitting 2D models of shape and appearance which are used for analysing human faces. The first contribution builds on previous work on discriminative fitting strategies for active appearance models (AAMs) in which regression models are trained to predict the location of shapes based on texture samples. In particular, we investigate non-parametric regression methods including random forests and Gaussian processes which are used together with gradient-like features for shape model fitting. We then develop two training algorithms which combine such models into sequences, and systematically compare their performance to existing linear generative AAM algorithms. Inspired by the performance of the Gaussian process-based regression methods, we investigate a group of non-linear latent variable models known as Gaussian process latent variable models (GPLVM). We discuss how such models may be used to develop a generative active appearance model algorithm whose texture model component is non-linear, and show how this leads to lower-dimensional models which are capable of generating more natural-looking images of faces when compared to equivalent linear models. We conclude by describing a novel supervised non-linear latent variable model based on Gaussian processes which we apply to the problem of recognising emotions from facial expressions. 153.6
10	Sur la méthode des moments pour l'estimation des modèles à variables latentes / On the method of moments for estimation in latent linear models Podosinnikova, Anastasia 01 December 2016 Latent linear models are powerful probabilistic tools for extracting useful latent structure from otherwise unstructured data and have proved useful in numerous applications such as natural language processing and computer vision. However, the estimation and inference are often intractable for many latent linear models and one has to make use of approximate methods, often with no recovery guarantees. An alternative approach, which has been popular lately, is based on the method of moments. These methods often have guarantees of exact recovery in the idealized setting of an infinite data sample and well-specified models, but they also often come with theoretical guarantees in cases where this is not exactly satisfied. In this thesis, we focus on moment matching-based estimation methods for different latent linear models. Using a close connection with independent component analysis, which is a well-studied tool from the signal processing literature, we introduce several semiparametric models in the topic modeling context and for multi-view models and develop moment matching-based methods for the estimation in these models. These methods come with improved sample complexity results compared to the previously proposed methods. The models are supplemented with identifiability guarantees, a necessary property to ensure their interpretability. This is opposed to some other widely used models, which are unidentifiable. Topic models Latent variable models Method of moments 004
