This thesis concerns systems for virtual presence in remote locations, a field referred to as telepresence. Recent image-based representations such as Google Maps Street View provide a familiar example. Several research problems remain open. Such image-based representations are very large, so the data must be compressed efficiently for storage. In addition, users are usually located remotely, so efficient transmission of the visual information is equally important.
In this work, real-world images are used in preference to computer graphics representations, mainly for the photorealism they provide and to avoid the high computational cost of simulating large-scale environments. The cubic format is selected for panoramas in this thesis. The captured cubic-panoramic image datasets are assumed to depict static scenes. The major issues for the system are compression efficiency and random access for storage, and computational complexity for transmission in response to remote users' requests.
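As a rough illustration of what the cubic format implies, the sketch below maps a viewing direction to one of six cube faces and to normalized coordinates on that face. The face names, the axis orientation, and the function name `direction_to_cube_face` are assumptions made for this example only; the thesis may use a different convention.

```python
def direction_to_cube_face(x, y, z):
    """Map a nonzero 3D viewing direction to (face, u, v), with u and v in [0, 1].

    The face names and u/v orientation are an illustrative convention,
    not one taken from the thesis.
    """
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax == 0 and ay == 0 and az == 0:
        raise ValueError("direction must be nonzero")

    if ax >= ay and ax >= az:
        # x is the dominant axis: the ray exits through the right or left face.
        face = "right" if x > 0 else "left"
        major, a, b = ax, (-z if x > 0 else z), y
    elif ay >= az:
        # y is the dominant axis: top or bottom face.
        face = "top" if y > 0 else "bottom"
        major, a, b = ay, x, (-z if y > 0 else z)
    else:
        # z is the dominant axis: front or back face.
        face = "front" if z > 0 else "back"
        major, a, b = az, (x if z > 0 else -x), y

    # Project onto the face plane, then rescale from [-1, 1] to [0, 1].
    u = 0.5 * (a / major + 1.0)
    v = 0.5 * (b / major + 1.0)
    return face, u, v
```

Because each face is an ordinary planar perspective image, standard block-based video tools can be applied to it directly, which is one commonly cited reason for preferring a cube over a spherical or cylindrical layout.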
First, to enable smooth navigation across different viewpoints, a method for aligning cubic-panorama image datasets using the geometry of the scene is proposed and tested. The method incorporates feature detection and camera calibration. Unlike the existing method, which is limited to a pair of panoramas, our approach is applicable to datasets with a large number of panoramic images, with no need for extra numerical estimation.
Second, the problem of cubic-panorama image dataset compression is addressed in a number of ways. Two state-of-the-art approaches, the standardized H.264 scheme and a wavelet-based codec named Dirac, are used and compared for virtual navigation in image-based representations of real-world environments. Different frame prediction structures and group-of-pictures (GOP) lengths are investigated and compared for this new type of visual data. Based on the results obtained, an efficient prediction structure and bitstream syntax are proposed that exploit features of the data and satisfy the major requirements of the system.
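The tension between GOP length and random access can be made concrete with a small sketch. It assumes the simplest possible structure, an intra-coded frame followed by a chain of predicted frames (IPPP), which is not necessarily the structure proposed in the thesis. The function names are hypothetical.

```python
def frames_to_decode(index, gop_length):
    """Frames that must be decoded to display frame `index` in an IPPP chain.

    Assumes each GOP starts with an intra frame and every later frame in the
    GOP is predicted from the one immediately before it.
    """
    if index < 0 or gop_length < 1:
        raise ValueError("index must be >= 0 and gop_length >= 1")
    return index % gop_length + 1


def average_access_cost(num_frames, gop_length):
    """Mean number of decoded frames per random access over the whole dataset."""
    total = sum(frames_to_decode(i, gop_length) for i in range(num_frames))
    return total / num_frames
```

Under this assumption, a longer GOP uses fewer costly intra frames but raises the average decoding work per random access, which is why the prediction structure matters for a navigation system where the user may jump to any panorama.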
Third, novel methods are proposed to address the important issue of disparity estimation. A client-server scheme is assumed, in which a remote user requests information at each navigation step. At the compression stage, a fast method estimates disparity vectors efficiently by combining our previous work on the geometry of the scene, the proposed prediction structure, and the cubic format of the panoramas.
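For orientation, the sketch below shows generic exhaustive block matching with a sum-of-absolute-differences (SAD) cost, the baseline that fast disparity estimation methods typically aim to speed up. It is not the geometry-based method of the thesis. Images are plain lists of rows of grayscale values, and all names and default parameters are assumptions for this example.

```python
def sad(ref, cur, rx, ry, cx, cy, block):
    """Sum of absolute differences between a block of `ref` at (rx, ry)
    and a block of `cur` at (cx, cy)."""
    total = 0
    for dy in range(block):
        for dx in range(block):
            total += abs(ref[ry + dy][rx + dx] - cur[cy + dy][cx + dx])
    return total


def best_disparity(ref, cur, cx, cy, block=4, search=2):
    """Exhaustively search a (2*search+1)^2 window in `ref` for the block
    that best matches the block of `cur` whose top-left corner is (cx, cy).

    Returns ((dx, dy), cost) for the lowest-cost candidate.
    """
    height, width = len(ref), len(ref[0])
    best, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            # Skip candidates whose block would fall outside the reference image.
            if rx < 0 or ry < 0 or rx + block > width or ry + block > height:
                continue
            cost = sad(ref, cur, rx, ry, cx, cy, block)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost
```

The cost of this search grows with the square of the search range. Prior knowledge of scene geometry could, in principle, narrow the candidates to a small neighbourhood around a predicted vector, which is the kind of saving a geometry-aware method would target.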
At the transmission stage, a new transcoding scheme is introduced, and a number of frame-format conversion scenarios are addressed towards the goal of free navigation. Different navigation scenarios are addressed, including forward and backward navigation as well as user pan, tilt, and zoom. In all of these cases, results are compared both visually, through error images and videos, and with objective measures. Altogether, this work facilitates free navigation within the captured panoramic image datasets, and it can be incorporated into state-of-the-art and emerging cubic-panorama image dataset compression and transmission schemes.
Identifier | oai:union.ndltd.org:LACETR/oai:collectionscanada.gc.ca:OOU.#10393/24053 |
Date | 23 April 2013 |
Creators | Salehi Doolabi, Saeed |
Source Sets | Library and Archives Canada ETDs Repository / Centre d'archives des thèses électroniques de Bibliothèque et Archives Canada |
Language | English |
Detected Language | English |
Type | Thèse / Thesis |