Steps towards the object semantic hierarchy

Xu, Changhai, 1977-. 17 November 2011.
An intelligent robot must be able to perceive and reason robustly about its world in terms of objects, among other foundational concepts. The robot can draw on rich data for object perception from continuous sensory input, in contrast to the usual formulation that focuses on objects in isolated still images. Additionally, the robot needs multiple object representations to deal with different tasks and/or different classes of objects. We propose the Object Semantic Hierarchy (OSH), which consists of multiple representations with different ontologies. The OSH factors the problem of object perception so that intermediate states of knowledge about an object have natural representations, with relatively easy transitions from less structured to more structured representations. Each layer in the hierarchy builds an explanation of the sensory input stream in terms of a stochastic model consisting of a deterministic model and an unexplained "noise" term. Each layer is constructed by identifying new invariants from the previous layer. In the final model, the scene is explained in terms of constant background and object models, plus low-dimensional dynamic poses of the observer and objects.

The OSH contains two types of layers: Object Layers and Model Layers. The Object Layers describe how the static background and each foreground object are individuated; the Model Layers describe how the model for the static background or for each foreground object evolves from less structured to more structured representations. Each object or background model contains the following layers (a data-structure sketch follows this abstract):

(1) 2D object in 2D space (2D2D): a set of constant 2D object views, and the time-variant 2D object poses;
(2) 2D object in 3D space (2D3D): a collection of constant 2D components, with their individual time-variant 3D poses;
(3) 3D object in 3D space (3D3D): the same collection of constant 2D components, but with invariant relations among their 3D poses, and the time-variant 3D pose of the object as a whole.

In building 2D2D object models, a fundamental problem is to segment foreground objects from the background environment in the pixel-level sensory input, and motion information is an important cue for this segmentation. Traditional approaches to moving-object segmentation usually rely on motion analysis of pure image information without exploiting the robot's motor signals. We observe, however, that the background motion (from the robot's egocentric view) is more strongly correlated with the robot's motor signals than the motion of foreground objects is. Based on this observation, we propose a novel approach to segmenting moving objects by learning homography and fundamental matrices from motor signals (see the second sketch below).

In building 2D3D and 3D3D object models, estimating camera motion parameters plays a key role. We propose a novel method for camera motion estimation that exploits both planar features and point features, fusing constraints from both homography and essential matrices in a single probabilistic framework (see the third sketch below). Using planar features greatly improves estimation accuracy over using point features alone, and with the help of point features, the solution ambiguity of a planar feature is resolved. Compared with the two classic approaches, which apply the constraint of either the homography or the essential matrix alone, the proposed method gives more accurate estimates and avoids the drawbacks of both.
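As a reading aid only, and not taken from the thesis, the three model layers can be pictured as data structures in which the constant parts are fields and the time-variant poses are per-frame sequences. All class and field names below are hypothetical.

```python
# Hypothetical data-structure sketch of the OSH model layers; names are
# illustrative, not from the thesis.
from dataclasses import dataclass
from typing import List
import numpy as np

@dataclass
class Model2D2D:
    views: List[np.ndarray]            # constant 2D object views (image patches)
    poses_2d: List[np.ndarray]         # time-variant 2D object pose per frame

@dataclass
class Model2D3D:
    components: List[np.ndarray]       # constant planar 2D components
    component_poses_3d: List[List[np.ndarray]]  # per-frame 3D pose of each component

@dataclass
class Model3D3D:
    components: List[np.ndarray]       # same constant 2D components as 2D3D
    relative_poses: List[np.ndarray]   # invariant component poses in the object frame
    object_pose_3d: List[np.ndarray]   # time-variant 3D pose of the object as a whole
```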
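The motor-signal learning itself is beyond a short example, but the way a background motion model separates foreground from background can be sketched. In the minimal, hypothetical Python/OpenCV sketch below, a RANSAC-fitted homography stands in for the background motion that the thesis would predict from motor signals; points that violate the background homography are candidate moving objects. Function and parameter names are illustrative.

```python
# Hedged sketch: background/foreground split via a dominant homography.
# RANSAC is a stand-in for the thesis's motor-signal-learned background
# model, and it assumes background points dominate the matches.
import cv2
import numpy as np

def segment_moving_points(frame_prev, frame_next, reproj_thresh=3.0):
    orb = cv2.ORB_create(2000)
    k1, d1 = orb.detectAndCompute(frame_prev, None)   # grayscale frames
    k2, d2 = orb.detectAndCompute(frame_next, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
    p1 = np.float32([k1[m.queryIdx].pt for m in matches])
    p2 = np.float32([k2[m.trainIdx].pt for m in matches])
    # Fit the dominant (background) homography; outliers move differently
    # from the background and are treated as foreground candidates.
    H, mask = cv2.findHomography(p1, p2, cv2.RANSAC, reproj_thresh)
    inlier = mask.ravel().astype(bool)
    return p2[~inlier], p2[inlier]    # (foreground candidates, background)
```

Predicting the homography from motor signals, as the abstract proposes, removes this sketch's assumption that background matches dominate.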
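Similarly, the fusion of planar (homography) and point (essential-matrix) constraints can only be approximated in a few lines. The hypothetical sketch below computes both models with OpenCV and selects one by RANSAC inlier count, noting where the planar-solution ambiguity appears; the thesis instead fuses both constraints in a single probabilistic framework.

```python
# Hedged sketch: choose between essential-matrix and homography motion models.
# A crude stand-in for the probabilistic fusion described in the abstract.
import cv2
import numpy as np

def estimate_camera_motion(p1, p2, K):
    E, mask_e = cv2.findEssentialMat(p1, p2, K, method=cv2.RANSAC,
                                     prob=0.999, threshold=1.0)
    H, mask_h = cv2.findHomography(p1, p2, cv2.RANSAC, 3.0)
    if mask_h.sum() > mask_e.sum():
        # Planar-dominant scene: decomposing H yields up to four (R, t, n)
        # solutions, the ambiguity the thesis resolves with point-feature
        # constraints. This sketch naively takes the first solution.
        _, Rs, ts, _ = cv2.decomposeHomographyMat(H, K)
        return Rs[0], ts[0]
    # Point-dominant scene: recover pose from the essential matrix.
    _, R, t, _ = cv2.recoverPose(E, p1, p2, K)
    return R, t
```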

Automatic image annotation extension for search and classification

Bouzayani, Abdessalem. 09 May 2018.
This thesis addresses the problem of image annotation extension. The rapid growth of available visual content has created a need for multimedia indexing and retrieval techniques, and image annotation makes indexing and search in large image collections easy and fast. Starting from image databases that are only partially annotated by hand, we aim to complete their annotations automatically, so as to make image retrieval and/or classification methods more effective.

For automatic image annotation extension, we use probabilistic graphical models. The proposed model is a mixture of multinomial distributions and Gaussian mixtures, combining visual and textual features (an illustrative sketch of such a model follows below). To reduce the cost of manual annotation and improve the quality of the resulting annotation, we incorporate user feedback into the model, through learning within learning, incremental learning, and active learning.

To bridge the semantic gap and enrich the image annotations, we use a semantic hierarchy that models numerous semantic relations between the annotation keywords. We present a semi-automatic method for building such a hierarchy from a set of keywords, and we then integrate it into the image annotation model. The model obtained with the hierarchy is a mixture of Bernoulli distributions and Gaussian mixtures.
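As an illustration of the kind of model described, and not the thesis's exact formulation, the sketch below pairs each latent component with a Gaussian over visual features and a multinomial over keywords, then scores candidate keywords for a new image as p(w|x) = Σ_k p(k|x) θ_k[w]. For simplicity the Gaussian mixture is fitted first and the keyword multinomials are estimated from its responsibilities, rather than by joint EM; all names are hypothetical.

```python
# Hedged sketch: Gaussian mixture over visual features paired with
# per-component keyword multinomials, used to extend partial annotations.
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_annotation_model(X, W, n_components=8, seed=0):
    """X: (n_images, d) visual features; W: (n_images, vocab) keyword counts."""
    gmm = GaussianMixture(n_components=n_components, random_state=seed).fit(X)
    resp = gmm.predict_proba(X)                  # p(k | x_i), shape (n, K)
    # Responsibility-weighted keyword counts -> per-component multinomials,
    # with add-one smoothing so unseen keywords keep nonzero probability.
    counts = resp.T @ W + 1.0                    # shape (K, vocab)
    theta = counts / counts.sum(axis=1, keepdims=True)
    return gmm, theta

def extend_annotation(gmm, theta, x, top_n=5):
    """Rank candidate keywords for an image with visual features x."""
    resp = gmm.predict_proba(x.reshape(1, -1))[0]   # p(k | x)
    scores = resp @ theta                           # p(w | x) over the vocab
    return np.argsort(scores)[::-1][:top_n]
```

User feedback of the kind the abstract mentions (incremental or active learning) would update W for the corrected images and refit, which this static sketch does not attempt.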
