• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 3
  • 2
  • Tagged with
  • 5
  • 5
  • 3
  • 3
  • 2
  • 2
  • 2
  • 2
  • 2
  • 1
  • 1
  • 1
  • 1
  • 1
  • 1
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Shape Recipes: Scene Representations that Refer to the Image

Freeman, William T., Torralba, Antonio 01 September 2002 (has links)
The goal of low-level vision is to estimate an underlying scene, given an observed image. Real-world scenes (e.g., albedos or shapes) can be very complex, conventionally requiring high dimensional representations which are hard to estimate and store. We propose a low-dimensional representation, called a scene recipe, that relies on the image itself to describe the complex scene configurations. Shape recipes are an example: these are the regression coefficients that predict the bandpassed shape from bandpassed image data. We describe the benefits of this representation, and show two uses illustrating their properties: (1) we improve stereo shape estimates by learning shape recipes at low resolution and applying them at full resolution; (2) Shape recipes implicitly contain information about lighting and materials and we use them for material segmentation.
2

Recognizing Indoor Scenes

Torralba, Antonio, Sinha, Pawan 25 July 2001 (has links)
We propose a scheme for indoor place identification based on the recognition of global scene views. Scene views are encoded using a holistic representation that provides low-resolution spatial and spectral information. The holistic nature of the representation dispenses with the need to rely on specific objects or local landmarks and also renders it robust against variations in object configurations. We demonstrate the scheme on the problem of recognizing scenes in video sequences captured while walking through an office environment. We develop a method for distinguishing between 'diagnostic' and 'generic' views and also evaluate changes in system performances as a function of the amount of training data available and the complexity of the representation.
3

Properties and Applications of Shape Recipes

Torralba, Antonio, Freeman, William T. 01 December 2002 (has links)
In low-level vision, the representation of scene properties such as shape, albedo, etc., are very high dimensional as they have to describe complicated structures. The approach proposed here is to let the image itself bear as much of the representational burden as possible. In many situations, scene and image are closely related and it is possible to find a functional relationship between them. The scene information can be represented in reference to the image where the functional specifies how to translate the image into the associated scene. We illustrate the use of this representation for encoding shape information. We show how this representation has appealing properties such as locality and slow variation across space and scale. These properties provide a way of improving shape estimates coming from other sources of information like stereo.
4

Prioritized 3d Scene Reconstruction And Rate-distortion Efficient Representation For Video Sequences

Imre, Evren 01 August 2007 (has links) (PDF)
In this dissertation, a novel scheme performing 3D reconstruction of a scene from a 2D video sequence is presented. To this aim, first, the trajectories of the salient features in the scene are determined as a sequence of displacements via Kanade-Lukas-Tomasi tracker and Kalman filter. Then, a tentative camera trajectory with respect to a metric reference reconstruction is estimated. All frame pairs are ordered with respect to their amenability to 3D reconstruction by a metric that utilizes the baseline distances and the number of tracked correspondences between the frames. The ordered frame pairs are processed via a sequential structure-from-motion algorithm to estimate the sparse structure and camera matrices. The metric and the associated reconstruction algorithm are shown to outperform their counterparts in the literature via experiments. Finally, a mesh-based, rate-distortion efficient representation is constructed through a novel procedure driven by the error between a target image, and its prediction from a reference image and the current mesh. At each iteration, the triangular patch, whose projection on the predicted image has the largest error, is identified. Within this projected region and its correspondence on the reference frame, feature matches are extracted. The pair with the least conformance to the planar model is used to determine the vertex to be added to the mesh. The procedure is shown to outperform the dense depth-map representation in all tested cases, and the block motion vector representation, in scenes with large depth range, in rate-distortion sense.
5

The Stixel World

Pfeiffer, David 31 August 2012 (has links)
Die Stixel-Welt ist eine neuartige und vielseitig einsetzbare Zwischenrepräsentation zur effizienten Beschreibung dreidimensionaler Szenen. Heutige stereobasierte Sehsysteme ermöglichen die Bestimmung einer Tiefenmessung für nahezu jeden Bildpunkt in Echtzeit. Das erlaubt zum einen die Anwendung neuer leistungsfähiger Algorithmen, doch gleichzeitig steigt die zu verarbeitende Datenmenge und der dadurch notwendig werdende Aufwand massiv an. Gerade im Hinblick auf die limitierte Rechenleistung jener Systeme, wie sie in der videobasierten Fahrerassistenz zum Einsatz kommen, ist dies eine große Herausforderung. Um dieses Problem zu lösen, bietet die Stixel-Welt eine generische Abstraktion der Rohdaten des Sensors. Jeder Stixel repräsentiert individuell einen Teil eines Objektes im Raum und segmentiert so die Umgebung in Freiraum und Objekte. Die Arbeit stellt die notwendigen Verfahren vor, um die Stixel-Welt mittels dynamischer Programmierung in einem einzigen globalen Optimierungsschritt in Echtzeit zu extrahieren. Dieser Prozess wird durch eine Vielzahl unterschiedlicher Annahmen über unsere von Menschenhand geschaffene Umgebung gestützt. Darauf aufbauend wird ein Kalmanfilter-basiertes Verfahren zur präzisen Bewegungsschätzung anderer Objekte vorgestellt. Die Arbeit stellt umfangreiche Bewertungen der zu erwartenden Leistungsfähigkeit aller vorgestellten Verfahren an. Dafür kommen sowohl vergleichende Ansätze als auch diverse Referenzsensoren, wie beispielsweise LIDAR, RADAR oder hochpräzise Inertialmesssysteme, zur Anwendung. Die Stixel-Welt ist eine extrem kompakte Abstraktion der dreidimensionalen Umgebung und bietet gleichzeitig einfachsten Zugriff auf alle essentiellen Informationen der Szene. Infolge dieser Arbeit war es möglich, die Effizienz vieler auf der Stixel-Welt aufbauender Algorithmen deutlich zu verbessern. / The Stixel World is a novel and versatile medium-level representation to efficiently bridge the gap between pixel-based processing and high-level vision. Modern stereo matching schemes allow to obtain a depth measurement for almost every pixel of an image in real-time, thus allowing the application of new and powerful algorithms. However, it also results in a large amount of measurement data that has to be processed and evaluated. With respect to vision-based driver assistance, these algorithms are executed on highly integrated low-power processing units that leave no room for algorithms with an intense calculation effort. At the same time, the growing number of independently executed vision tasks asks for new concepts to manage the resulting system complexity. These challenges are tackled by introducing a pre-processing step to extract all required information in advance. Each Stixel approximates a part of an object along with its distance and height. The Stixel World is computed in a single unified optimization scheme. Strong use is made of physically motivated a priori knowledge about our man-made three-dimensional environment. Relying on dynamic programming guarantees to extract the globally optimal segmentation for the entire scenario. Kalman filtering techniques are used to precisely estimate the motion state of all tracked objects. Particular emphasis is put on a thorough performance evaluation. Different comparative strategies are followed which include LIDAR, RADAR, and IMU reference sensors, manually created ground truth data, and real-world tests. Altogether, the Stixel World is ideally suited to serve as the basic building block for today''s increasingly complex vision systems. It is an extremely compact abstraction of the actual world giving access to the most essential information about the current scenario. Thanks to this thesis, the efficiency of subsequently executed vision algorithms and applications has improved significantly.

Page generated in 0.1607 seconds