• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 4
  • Tagged with
  • 5
  • 5
  • 5
  • 3
  • 3
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • 2
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Fast Inference for Interactive Models of Text

Lund, Jeffrey A 01 September 2015 (has links)
Probabilistic models of text are a useful tool for enabling the analysis of large collections of digital text. For example, Latent Dirichlet Allocation can quickly produce topical summaries of large collections of text documents. Many important uses cases of such models include human interaction during the inference process for these models of text. For example, the Interactive Topic Model extends Latent Dirichlet Allocation to incorporate human expertiese during inference in order to produce topics which are better suited to individual user needs. However, interactive use cases of probabalistic models of text introduce new constraints on inference - the inference procedure must not only be accurate, but also fast enough to facilitate human interaction. If the inference is too slow, then the human interaction will be harmed, and the interactive aspect of the probalistic model will be less useful. Unfortunately, the most popular inference algorithms in use today either require strong approximations which can degrade the quality of some models, or require time-consuming sampling. We explore the use of Iterated Conditional Modes, an algorithm which is able to obtain locally optimal maximum a posteriori estimates, as an alternative to popular inference algorithms such as Gibbs sampling or mean field variational inference. Iterated Conditional Modes algorithm is not only fast enough to facilitate human interaction, but can produce better maximum a posteriori estimates than sampling. We demonstrate the superior performance of Iterated Conditional Modes on a wide variety of models. First we use a DP Mixture of Multinomials model applied to the problem of web search result cluster, and show that not only can we outperform previous methods in clustering quality, but we can achieve interactive runtimes when performing inference with Iterated Conditional Modes. We then apply Iterated Conditional Modes to the Interactive Topic Model. Not only is Iterated Conditional Modes much faster than the previous published Gibbs sampler, but we are better able to incorporate human feedback during inference, as measured by accuracy on a classification task using the resultant topic model. Finally, we utilize Iterated Conditional Modes with MomResp, a model used to aggregate multiple noisy crowdsourced data. Compared with Gibbs sampling, Iterated Conditional Modes is better able to recover ground truth labels from simulated noisy annotations, and runs orders of magnitude faster.
2

Segmentation vidéo et suivi d'objets multiples / Video segmentation and multiple object tracking

Kumar, Ratnesh 15 December 2014 (has links)
Dans cette thèse nous proposons de nouveaux algorithmes d'analyse vidéo. La première contribution de cette thèse concerne le domaine de la segmentation de vidéos avec pour objectif d'obtenir une segmentation dense et spatio-temporellement cohérente. Nous proposons de combiner les aspects spatiaux et temporels d'une vidéo en une seule notion, celle de Fibre. Une fibre est un ensemble de trajectoires qui sont spatialement connectées par un maillage. Les fibres sont construites en évaluant simultanément les aspects spatiaux et temporels. Par rapport a l’état de l'art une segmentation de vidéo a base de fibres présente comme avantages d’accéder naturellement au voisinage grâce au maillage et aux correspondances temporelles pour la plupart des pixels de la vidéo. De plus, cette segmentation à base de fibres a une complexité quasi linéaire par rapport au nombre de pixels. La deuxième contribution de cette thèse concerne le suivi d'objets multiples. Nous proposons une approche de suivi qui utilise des caractéristiques des points suivis, la cinématique des objets suivis et l'apparence globale des détections. L'unification de toutes ces caractéristiques est effectuée avec un champ conditionnel aléatoire. Ensuite ce modèle est optimisé en combinant les techniques de passage de message et une variante de processus ICM (Iterated Conditional Modes) pour inférer les trajectoires d'objet. Une troisième contribution mineure consiste dans le développement d'un descripteur pour la mise en correspondance d'apparences de personne. Toutes les approches proposées obtiennent des résultats compétitifs ou meilleurs (qualitativement et quantitativement) que l’état de l'art sur des base de données. / In this thesis we propose novel algorithms for video analysis. The first contribution of this thesis is in the domain of video segmentation wherein the objective is to obtain a dense and coherent spatio-temporal segmentation. We propose joining both spatial and temporal aspects of a video into a single notion Fiber. A fiber is a set of trajectories which are spatially connected by a mesh. Fibers are built by jointly assessing spatial and temporal aspects of the video. Compared to the state-of-the-art, a fiber based video segmentation presents advantages such as a natural spatio-temporal neighborhood accessor by a mesh, and temporal correspondences for most pixels in the video. Furthermore, this fiber-based segmentation is of quasi-linear complexity w.r.t. the number of pixels. The second contribution is in the realm of multiple object tracking. We proposed a tracking approach which utilizes cues from point tracks, kinematics of moving objects and global appearance of detections. Unification of all these cues is performed on a Conditional Random Field. Subsequently this model is optimized by a combination of message passing and an Iterated Conditional Modes (ICM) variant to infer object-trajectories. A third, minor, contribution relates to the development of suitable feature descriptor for appearance matching of persons. All of our proposed approaches achieve competitive and better results (both qualitatively and quantitatively) than state-of-the-art on open source datasets.
3

Stochastic Nested Aggregation for Images and Random Fields

Wesolkowski, Slawomir Bogumil 27 March 2007 (has links)
Image segmentation is a critical step in building a computer vision algorithm that is able to distinguish between separate objects in an image scene. Image segmentation is based on two fundamentally intertwined components: pixel comparison and pixel grouping. In the pixel comparison step, pixels are determined to be similar or different from each other. In pixel grouping, those pixels which are similar are grouped together to form meaningful regions which can later be processed. This thesis makes original contributions to both of those areas. First, given a Markov Random Field framework, a Stochastic Nested Aggregation (SNA) framework for pixel and region grouping is presented and thoroughly analyzed using a Potts model. This framework is applicable in general to graph partitioning and discrete estimation problems where pairwise energy models are used. Nested aggregation reduces the computational complexity of stochastic algorithms such as Simulated Annealing to order O(N) while at the same time allowing local deterministic approaches such as Iterated Conditional Modes to escape most local minima in order to become a global deterministic optimization method. SNA is further enhanced by the introduction of a Graduated Models strategy which allows an optimization algorithm to converge to the model via several intermediary models. A well-known special case of Graduated Models is the Highest Confidence First algorithm which merges pixels or regions that give the highest global energy decrease. Finally, SNA allows us to use different models at different levels of coarseness. For coarser levels, a mean-based Potts model is introduced in order to compute region-to-region gradients based on the region mean and not edge gradients. Second, we develop a probabilistic framework based on hypothesis testing in order to achieve color constancy in image segmentation. We develop three new shading invariant semi-metrics based on the Dichromatic Reflection Model. An RGB image is transformed into an R'G'B' highlight invariant space to remove any highlight components, and only the component representing color hue is preserved to remove shading effects. This transformation is applied successfully to one of the proposed distance measures. The probabilistic semi-metrics show similar performance to vector angle on images without saturated highlight pixels; however, for saturated regions, as well as very low intensity pixels, the probabilistic distance measures outperform vector angle. Third, for interferometric Synthetic Aperture Radar image processing we apply the Potts model using SNA to the phase unwrapping problem. We devise a new distance measure for identifying phase discontinuities based on the minimum coherence of two adjacent pixels and their phase difference. As a comparison we use the probabilistic cost function of Carballo as a distance measure for our experiments.
4

Stochastic Nested Aggregation for Images and Random Fields

Wesolkowski, Slawomir Bogumil 27 March 2007 (has links)
Image segmentation is a critical step in building a computer vision algorithm that is able to distinguish between separate objects in an image scene. Image segmentation is based on two fundamentally intertwined components: pixel comparison and pixel grouping. In the pixel comparison step, pixels are determined to be similar or different from each other. In pixel grouping, those pixels which are similar are grouped together to form meaningful regions which can later be processed. This thesis makes original contributions to both of those areas. First, given a Markov Random Field framework, a Stochastic Nested Aggregation (SNA) framework for pixel and region grouping is presented and thoroughly analyzed using a Potts model. This framework is applicable in general to graph partitioning and discrete estimation problems where pairwise energy models are used. Nested aggregation reduces the computational complexity of stochastic algorithms such as Simulated Annealing to order O(N) while at the same time allowing local deterministic approaches such as Iterated Conditional Modes to escape most local minima in order to become a global deterministic optimization method. SNA is further enhanced by the introduction of a Graduated Models strategy which allows an optimization algorithm to converge to the model via several intermediary models. A well-known special case of Graduated Models is the Highest Confidence First algorithm which merges pixels or regions that give the highest global energy decrease. Finally, SNA allows us to use different models at different levels of coarseness. For coarser levels, a mean-based Potts model is introduced in order to compute region-to-region gradients based on the region mean and not edge gradients. Second, we develop a probabilistic framework based on hypothesis testing in order to achieve color constancy in image segmentation. We develop three new shading invariant semi-metrics based on the Dichromatic Reflection Model. An RGB image is transformed into an R'G'B' highlight invariant space to remove any highlight components, and only the component representing color hue is preserved to remove shading effects. This transformation is applied successfully to one of the proposed distance measures. The probabilistic semi-metrics show similar performance to vector angle on images without saturated highlight pixels; however, for saturated regions, as well as very low intensity pixels, the probabilistic distance measures outperform vector angle. Third, for interferometric Synthetic Aperture Radar image processing we apply the Potts model using SNA to the phase unwrapping problem. We devise a new distance measure for identifying phase discontinuities based on the minimum coherence of two adjacent pixels and their phase difference. As a comparison we use the probabilistic cost function of Carballo as a distance measure for our experiments.
5

Visual Place Recognition in Changing Environments using Additional Data-Inherent Knowledge

Schubert, Stefan 15 November 2023 (has links)
Visual place recognition is the task of finding same places in a set of database images for a given set of query images. This becomes particularly challenging for long-term applications when the environmental condition changes between or within the database and query set, e.g., from day to night. Visual place recognition in changing environments can be used if global position data like GPS is not available or very inaccurate, or for redundancy. It is required for tasks like loop closure detection in SLAM, candidate selection for global localization, or multi-robot/multi-session mapping and map merging. In contrast to pure image retrieval, visual place recognition can often build upon additional information and data for improvements in performance, runtime, or memory usage. This includes additional data-inherent knowledge about information that is contained in the image sets themselves because of the way they were recorded. Using data-inherent knowledge avoids the dependency on other sensors, which increases the generality of methods for an integration into many existing place recognition pipelines. This thesis focuses on the usage of additional data-inherent knowledge. After the discussion of basics about visual place recognition, the thesis gives a systematic overview of existing data-inherent knowledge and corresponding methods. Subsequently, the thesis concentrates on a deeper consideration and exploitation of four different types of additional data-inherent knowledge. This includes 1) sequences, i.e., the database and query set are recorded as spatio-temporal sequences so that consecutive images are also adjacent in the world, 2) knowledge of whether the environmental conditions within the database and query set are constant or continuously changing, 3) intra-database similarities between the database images, and 4) intra-query similarities between the query images. Except for sequences, all types have received only little attention in the literature so far. For the exploitation of knowledge about constant conditions within the database and query set (e.g., database: summer, query: winter), the thesis evaluates different descriptor standardization techniques. For the alternative scenario of continuous condition changes (e.g., database: sunny to rainy, query: sunny to cloudy), the thesis first investigates the qualitative and quantitative impact on the performance of image descriptors. It then proposes and evaluates four unsupervised learning methods, including our novel clustering-based descriptor standardization method K-STD and three PCA-based methods from the literature. To address the high computational effort of descriptor comparisons during place recognition, our novel method EPR for efficient place recognition is proposed. Given a query descriptor, EPR uses sequence information and intra-database similarities to identify nearly all matching descriptors in the database. For a structured combination of several sources of additional knowledge in a single graph, the thesis presents our novel graphical framework for place recognition. After the minimization of the graph's error with our proposed ICM-based optimization, the place recognition performance can be significantly improved. For an extensive experimental evaluation of all methods in this thesis and beyond, a benchmark for visual place recognition in changing environments is presented, which is composed of six datasets with thirty sequence combinations.

Page generated in 0.1553 seconds