  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
41

Optimizing motion estimation in MPEG-2 standard

Saparon, Azilah January 2005 (has links)
An efficient compression algorithm is vital for the storage and transmission of video signals. Many video coding standards such as ISO MPEG-1/2 and ITU-T H.261/262/263 apply block motion estimation and compensation algorithms to exploit temporal redundancies, whose reduction is the key to high performance in video coding. The full search algorithm is a brute-force block motion estimation method used in these standards. It offers the best quality so far, but its high computational complexity makes it unsuitable for real-time implementations. This thesis proposes and demonstrates a new scanning order that minimises the computational complexity of the matching process, especially in the full search algorithm, with similar or only acceptably degraded quality performance.
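To make the cost of the exhaustive approach concrete, the following is a minimal sketch (not the thesis's proposed scanning order) of full-search block matching: every displacement in a ±R window is tested and the one minimising the sum of absolute differences (SAD) is kept. Function name and parameters are illustrative assumptions.

```python
import numpy as np

def full_search(ref, cur, bx, by, B=16, R=7):
    """Exhaustively test every displacement (dx, dy) in a +/-R window and
    return the one minimising the SAD between the current block in `cur`
    and the displaced candidate block in `ref`."""
    block = cur[by:by + B, bx:bx + B].astype(np.int64)
    H, W = ref.shape
    best_dx, best_dy, best_sad = 0, 0, np.inf
    for dy in range(-R, R + 1):
        for dx in range(-R, R + 1):
            y, x = by + dy, bx + dx
            if y < 0 or x < 0 or y + B > H or x + B > W:
                continue  # candidate block falls outside the frame
            cand = ref[y:y + B, x:x + B].astype(np.int64)
            sad = int(np.abs(block - cand).sum())
            if sad < best_sad:
                best_dx, best_dy, best_sad = dx, dy, sad
    return best_dx, best_dy, best_sad
```

The nested loops make the cost grow quadratically with the window size, which is exactly the complexity that a better scanning order seeks to reduce.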
42

The effect of audio on the visual perception of high-fidelity animated 3D computer graphics

Mastoropoulou, Georgia January 2007 (has links)
Sound is often an integral part of interactive animated scenarios, such as VR applications, games and realistic simulations. Up to now, research has focussed on how visual stimuli can affect the user's perceived quality of the rendered graphics. Although there is a plethora of evidence from the psychology field about crossmodal interactions between visual and auditory stimuli, graphics researchers have not yet considered exploiting sound in order to affect the perceived quality of the 3D visual environment. Furthermore, although it is well known that sound is attention-grabbing, researchers until now have not considered using sound to direct gaze to specific objects or areas in the visual environment, in order to allow for selective rendering of the scene: only the sound-emitting objects are rendered at high quality, while the reduced quality of the rest of the 3D scene goes unnoticed by the observers. Our research aims to fill this gap by investigating the influence of auditory stimuli on the perceived visual quality (rendering quality and frame rate) of computer-generated animated imagery. To gain a better understanding of the crossmodal interactions between the auditory and visual sensory modalities, and to identify whether such interactions could lead to a new generation of perceptually adaptive graphics techniques that take into account not only the visual stimuli but also the auditory background of a 3D scene, 292 subjects participated in five experiments. Temporal and visual display quality perceptions were investigated by manipulating the frame rate and the rendering quality (number of rays shot per pixel of the image) separately, and by considering different auditory backgrounds. Our experimental studies verified that we can affect the viewer's perception of the delivered frame rate with the use of audio.
Further results show that viewers do fixate on sources of sound effects in a scene, even when engaged in a demanding visual task, allowing us to render the corresponding pixels at high quality and significantly drop the quality for the rest of the scene, without any noticeable difference to the observer. In both cases, we save processing resources and/or significant computational time.
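The selective-rendering idea described above — a higher per-pixel ray budget near a sound-emitting object, a lower one in the periphery — can be pictured with the following sketch. The function, radius and ray counts are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def rays_per_pixel(width, height, focus_xy, radius, hi=16, lo=1):
    """Return a per-pixel ray budget: 'hi' rays inside the attention
    region centred on a sound-emitting object, 'lo' rays elsewhere."""
    ys, xs = np.mgrid[0:height, 0:width]
    dist = np.hypot(xs - focus_xy[0], ys - focus_xy[1])
    return np.where(dist <= radius, hi, lo)
```

A renderer would then shoot `rays_per_pixel(...)[y, x]` rays for each pixel, spending most of the budget where gaze is expected to land.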
43

An integrated block based motion estimation framework for video applications

Bhaskar, Harish January 2007 (has links)
Motion estimation is a popular technique for computing the displacement vectors of objects or attributes between image frames at different time stamps. Motion estimation is critical and forms an integral part of many application domains such as video coding, compression, object tracking, video indexing and video stabilization. However, the available motion models are restricted in their generality and have been tailored for use in specific application areas. The purpose of this thesis is to propose an integrated block-based motion estimation framework that serves different real-time video applications, including object tracking, video stabilization and low-level video indexing. In this thesis, the proposed framework for motion estimation is based on block matching, a well-known strategy for motion estimation, particularly in the domain of video coding and compression. Traditional block matching techniques are limited in the following ways: i) block partitioning methods, whether fixed or variable sized, divide image frames into blocks blindly, neglecting the features within; ii) block search schemes assume restricted translational displacements (usually bounded by a search window: time complexity rises exponentially with search size); iii) during block matching, blocks are assumed to undergo negligible or no rotational motion and cannot suffer complex deformation during motion; iv) finally, block matching methods are incapable of handling occlusion. The thesis proposes an integrated block-based motion estimation framework that handles the aforementioned limitations of existing schemes. First, we propose a vector quantization based block partitioning methodology that extends the quad-tree mechanism but places partitions such that similar or near-similar attributes are clustered within the same block.
In this way we preserve the advantages of variable block matching by using a well-defined quad-tree data structure, while at the same time separating regions of interest from the rest of the image. At the second level of abstraction, we propose a genetic algorithm based block search scheme. A genetic algorithm based search mechanism presents a similar range of computation time irrespective of the amount of displacement. This allows the search space to remain unrestricted while maintaining tolerable time complexity. An immediate extension to the basic framework is presented as a rotation-invariant scheme that is further generalized into a deformation-handling mechanism for motion estimation. We use an affine-based integrated model alongside the genetic algorithm search to match blocks (in turn, image attributes) that may or may not undergo rotation or deformation during motion estimation. We also present a novel extension of the 2D affine genetic algorithm combination for handling specific 3D rotational changes to blocks. In this thesis, we also propose to integrate a novel motion correction mechanism based on probabilistic motion modeling for occlusion handling. Finally, for optimization purposes, we add a scale-space based architecture to the framework. This optimization procedure allows the system to automatically choose, based on performance metrics, the operating resolution so that the quality-to-time ratio is maintained. We test the developed framework in different real-time applications. First, we present object tracking using block-based motion estimation. We show through this research that more reliable object tracking results can be obtained by combining motion characteristics with feature tracking. Second, we present a video stabilization model using the proposed motion estimation technique.
In this application, we compare the performance of the proposed block-based motion estimation scheme to the techniques specified in the literature and thereby demonstrate the efficiency and robustness of the framework. Finally, we tackle the problem of low-level video indexing using a weighted combination of features during motion estimation.
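A hedged sketch of the genetic-algorithm block search idea — evolving a small population of candidate displacements instead of scanning the whole window — is given below. The population size, mutation rate and crossover scheme are illustrative assumptions, not the thesis's actual parameters.

```python
import numpy as np

def sad(ref, cur, bx, by, dx, dy, B=16):
    """SAD cost of displacing the current block by (dx, dy)."""
    H, W = ref.shape
    y, x = by + dy, bx + dx
    if y < 0 or x < 0 or y + B > H or x + B > W:
        return np.inf  # out-of-frame candidates are infinitely bad
    a = cur[by:by + B, bx:bx + B].astype(np.int64)
    b = ref[y:y + B, x:x + B].astype(np.int64)
    return int(np.abs(a - b).sum())

def ga_search(ref, cur, bx, by, B=16, R=16, pop=20, gens=25, seed=1):
    """Evolve candidate displacement vectors: selection of the fittest
    half, one-point crossover, and small random mutations."""
    rng = np.random.default_rng(seed)
    # include the zero vector so, with elitism, the result is never
    # worse than no motion at all
    cands = [(0, 0)] + [tuple(rng.integers(-R, R + 1, 2)) for _ in range(pop - 1)]
    best = min(cands, key=lambda c: sad(ref, cur, bx, by, *c))
    for _ in range(gens):
        parents = sorted(cands, key=lambda c: sad(ref, cur, bx, by, *c))[:pop // 2]
        children = []
        for _ in range(pop - len(parents)):
            p1 = parents[rng.integers(len(parents))]
            p2 = parents[rng.integers(len(parents))]
            child = (p1[0], p2[1])  # crossover: mix parent components
            if rng.random() < 0.3:  # mutation: small random perturbation
                child = (int(np.clip(child[0] + rng.integers(-2, 3), -R, R)),
                         int(np.clip(child[1] + rng.integers(-2, 3), -R, R)))
            children.append(child)
        cands = parents + children
        gen_best = min(cands, key=lambda c: sad(ref, cur, bx, by, *c))
        if sad(ref, cur, bx, by, *gen_best) < sad(ref, cur, bx, by, *best):
            best = gen_best
    return best
```

The number of SAD evaluations depends on `pop * gens` rather than on the window area, which is the property that keeps the computation time roughly constant regardless of displacement magnitude.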
44

Highly scalable 2D model-based video coding

Hu, Mingyou January 2005 (has links)
With the rapid convergence of the computer, communications and entertainment industries, we can expect a trend of growing heterogeneity (in channel bandwidth, receiver capacity, etc.) in future digital video coding applications. Furthermore, new functions are appearing, such as object manipulation, which should be supported by video coding techniques. The traditional video coding approach is poorly suited to this heterogeneity and to user interaction. Scalable coding, allowing partial decoding at a variety of resolution, temporal, quality and object levels from a single compressed codestream, is widely considered a promising technology for efficient signal representation and transmission in a heterogeneous environment. However, although several scalable algorithms have been proposed in the literature and the international standards over the last decade, further research is necessary to improve the compression performance of scalable video coding. This thesis investigates a scalable 2D model-based video coding method with efficient video compression as well as excellent scalability performance, in order to satisfy these newly emerging requirements. It first examines the main model-based video coding techniques and scalable video coding methods. The parametric video models that describe the real world and the image generation process are also briefly described. Next, video segmentation algorithms are investigated to semantically represent the video frame as video objects. At the first frame, the texture information and the motion from the first several frames are used to extract the semantic foreground objects. For some sequences, user interaction is required to obtain semantic objects. In later frames, the proposed complexity-scalable contour-tracking algorithm is used to segment each frame. After that, each object is progressively approximated using a three-layer 2D mesh model.
In order to represent the motion of the human face more precisely, face detection and modelling are also investigated. This technique, in which the human face is modelled separately, is shown to produce improvements in object motion representation. Scalable model compression is also outlined in this thesis. The object model is represented in two parts, object shape and interior object model, which are compressed separately. A scalable contour approximation algorithm is proposed. Both intra- and predictive scalable shape-coding algorithms are investigated and proposed to code the object shape progressively. The encoded coarser layers are used to improve the coding efficiency of the current layer. The effectiveness of these algorithms is demonstrated through the results of extensive experiments. We also investigate the scalable texture coding of video objects. An improved shape-adaptive SPECK algorithm is employed in intra-texture coding and is also used for residual texture coding after motion-compensated temporal filtering. During motion-compensated temporal filtering, a scalable mesh object model is used, and scalable motion vector coding is achieved using a CABAC codec. A hierarchically structured bitstream, optimised for rate-distortion, is created to facilitate efficient bit truncation and bit allocation among video frames and video objects. The coding system can encode/decode each video object independently and generate a separate bitstream for each object. As our experiments show, such high coding scalability is achieved in the proposed coding system without the significant cost in compression performance commonly experienced in most scalable coding systems.
45

Multiple description coding for 3D video

Karim, Hezerul Abdul January 2008 (has links)
In the near future, 3D video is likely to be used to enhance video applications, as it offers a greater sense of immersion. When 3D video is compressed and transmitted over error-prone channels, the associated packet loss leads to poor visual quality. Hence, error resilience techniques for 3D video are needed. This thesis aims to improve the error robustness of compressed 3D video in error-prone transmission scenarios. Firstly, this thesis describes how 3D video can be represented using 2D video information and depth information. This format can be compressed using tools available in some video coding standards, including the Multiple Auxiliary Component (MAC) tool in MPEG-4 version 2, and the use of reduced-resolution coding for depth compression. It is observed that reduced-resolution depth compression improves 2D video performance. However, the quality of the depth information is limited at high bit rates due to the distortion introduced by down-sampling and up-sampling (DSUS). Secondly, Multiple Description Coding (MDC) based on even and odd frames is proposed for error-resilient 3D video. Improvements are made to the original scheme by adding a controllable amount of side information to improve frame interpolation at the decoder and compression efficiency. The side information is also sent according to the motion in the video sequence for further improvement. The proposed MDC algorithms are found to perform better than single description coding (SDC) and the original scheme at high error rates, at the cost of reduced error-free coding efficiency. Finally, the combination of Scalable Video Coding (SVC) and MDC (scalable MDC) for 3D video is investigated for error robustness and scalability. A scalable MDC scheme based on even and odd frames is proposed for H.264-based SVC. Reduced-resolution depth compression is then applied to improve the performance.
The proposed algorithms provide better 3D video performance than the original SVC in error-prone environments and for low bit-rate video. Keywords: stereoscopic 3D video coding, 2D and depth, error resilience, multiple description video coding, scalable multiple description video coding.
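The even/odd-frame MDC principle described above can be sketched as follows: the sequence is split into two descriptions, and a decoder that receives only one of them conceals the missing frames by temporal interpolation. This is a minimal sketch of the baseline idea without the thesis's side information; function names are illustrative assumptions.

```python
import numpy as np

def split_even_odd(frames):
    """Split a frame sequence into two descriptions:
    even-indexed frames and odd-indexed frames."""
    return frames[0::2], frames[1::2]

def decode_even_only(even, n_frames):
    """Decoder that received only the even description: place the even
    frames and reconstruct each missing odd frame by averaging its
    temporal neighbours (repeating the last frame at the sequence end)."""
    out = [None] * n_frames
    for k, f in enumerate(even):
        out[2 * k] = f
    for i in range(1, n_frames, 2):
        prev = out[i - 1]
        nxt = out[i + 1] if i + 1 < n_frames else out[i - 1]
        out[i] = (prev + nxt) / 2.0
    return out
```

When both descriptions arrive, the frames are simply interleaved back; the interpolation path is only the loss-concealment fallback.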
46

Advanced motion estimation algorithms in the frequency domain for digital video applications

Argyriou, Vasileios January 2006 (has links)
Motion estimation is a technique that is used frequently within the fields of image and video processing. It describes the process of determining the motion between two or more frames in an image sequence. There are several different approaches to estimating the motion present within a scene. In general, the most well-known motion estimation algorithms can be separated into fixed or variable block size, object-based and dense motion estimation methods. Motion estimation has a variety of important applications, such as video coding, frame rate conversion, de-interlacing, object tracking and spatio-temporal segmentation, as well as medical, military and security applications. The appropriate motion measurement method is selected based on the application and the available computational power. Several such motion estimation techniques are described in detail, all of which operate in the frequency domain based on phase correlation. The main objective and purpose of this study is to improve the state-of-the-art motion estimation techniques that operate in the frequency domain based on phase correlation, and to introduce novel methods providing more accurate and reliable estimates. Furthermore, research is carried out to investigate and suggest algorithms for all motion estimation categories, based on the density of the optical flow, starting from block-based and moving to dense vector fields. Highly accurate and computationally efficient block-based techniques, utilising either gradient information or hypercomplex correlation, are suggested as suitable for the estimation of motion in video sequences, improving on the baseline phase correlation method. Furthermore, a novel sub-pixel motion estimation technique using phase correlation, resulting in high-accuracy motion estimates, is presented in detail.
A quad-tree scheme for obtaining variable-size block-based sub-pixel estimates of interframe motion in the frequency domain is proposed, using either key features of the phase correlation surface or the motion-compensated prediction error to manage the partition of a parent block into four child quadrants. Sub-pixel estimates for arbitrarily shaped regions are obtained in a proposed frequency-domain scheme, without requiring extrapolation, based on phase correlation and the shape-adaptive discrete Fourier transform. In the last part of this study, two fast dense motion estimation methods operating in the frequency domain are presented, based either on texture segmentation or multi-overlapped correlation, utilising either weighted averages or the novel gradient normalised convolution to restore missing motion vectors of the resulting dense vector field, and requiring significantly lower computational power compared to spatial and robust algorithms. Based on the performance study of the proposed frequency-domain motion estimation techniques, performance advantages over the baseline phase correlation are achieved in terms of the motion-compensated prediction error and zero-order entropy, indicating a higher level of compressibility and improved motion vector coherence. One of the most attractive features of the proposed schemes is that they enjoy a high degree of computational efficiency and can be implemented by fast transformation algorithms in the frequency domain. In conclusion, the complexity and performance of each of the proposed schemes make them attractive for low computational power and real-time applications. Furthermore, they provide estimates comparable to spatial-domain techniques and closer to the real motion present in the scene, making them suitable for object tracking and 3D scene reconstruction.
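The baseline phase correlation method that the techniques above improve upon can be sketched in a few lines: the normalised cross-power spectrum of two frames is inverted, and the location of the resulting peak gives the integer-pixel translation. This is the textbook baseline, not the thesis's refined sub-pixel variants.

```python
import numpy as np

def phase_correlation(a, b):
    """Estimate the integer-pixel translation of frame b relative to
    frame a from the peak of the normalised cross-power spectrum."""
    A, B = np.fft.fft2(a), np.fft.fft2(b)
    cross = np.conj(A) * B
    cross /= np.abs(cross) + 1e-12       # keep only phase information
    surface = np.fft.ifft2(cross).real   # phase correlation surface
    dy, dx = np.unravel_index(np.argmax(surface), surface.shape)
    H, W = a.shape
    if dy > H // 2:                      # map peak position to signed shift
        dy -= H
    if dx > W // 2:
        dx -= W
    return int(dy), int(dx)
```

Because only two FFTs and one inverse FFT are needed, the cost is O(N log N) regardless of the displacement magnitude, which is the computational-efficiency property the abstract highlights.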
47

An image-based approach to the rendering of crowds in real-time

Tecchia, Franco January 2007 (has links)
The wide use of computer graphics in games, entertainment, medical, architectural and cultural applications has led to it becoming a prevalent area of research. Games and entertainment in general have become one of the driving forces of the real-time computer graphics industry, bringing reasonably realistic, complex and appealing virtual worlds to the mass market. At the current stage of technology, a user can interactively navigate through complex, polygon-based scenes rendered with sophisticated lighting, at times interacting with AI-based synthetic characters or virtual humans. As the size and complexity of the environments continuously increase, there is a growing need to add common real-life phenomena such as the presence of crowds. Rendering highly populated urban environments requires a good synthesis of two partially overlapping problems: the management of vast and detailed environments, and the visualisation of large-scale crowds. While in the past a large amount of research has gone into the investigation of optimisation strategies and speed-up methods dedicated to large and static polygonal models, real-time visualisation of animated crowds poses novel challenges due to the computational power needed to visualise a multitude of animated characters: scenarios where thousands of characters are on-screen simultaneously can easily lead to polygon budgets exceeding millions of triangles, which are hardly possible to render at interactive frame rates even on the most powerful graphics technology available today. The present thesis extensively investigates this topic, and proposes the use of an image-based data representation in order to speed up rendering of animated characters, so as to achieve interactive frame rates even when crowds composed of thousands of individuals are on-screen.
We advance over the state of the art in this field by introducing a novel form of impostor rendering technique for animated crowds, and presenting methods that exploit this novel form of representation to also handle advanced rendering effects such as crowd lighting and shadowing. We also discuss the important aspects of compatibility of our new method with existing polygon-based scene-graph architectures. The results show that real-time crowd rendering is an application scenario where the introduction of image-based rendering methods can truly be beneficial, making possible the rendering of crowds composed of thousands of individuals in real time in any existing rendering software framework, even on commodity hardware.
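At its simplest, the impostor idea is to pre-render each character from a set of sampled viewing directions and, at run time, display the snapshot nearest to the current camera angle instead of the full polygonal model. The sketch below shows only that selection step; the sampling count and function name are illustrative assumptions, not the thesis's method.

```python
import math

def impostor_index(view_angle, n_views=16):
    """Select which precomputed impostor snapshot to display by rounding
    the current viewing angle (radians) to the nearest of n_views
    uniformly sampled directions around the character."""
    step = 2.0 * math.pi / n_views
    return int(round(view_angle / step)) % n_views
```

Per character, the run-time cost is one table lookup and one textured quad, which is what lets thousands of individuals fit within an interactive frame budget.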
48

Content scalability in multiple description image and video coding

Majid, Muhammad January 2011 (has links)
High compression ratio, scalability and reliability are the main issues for transmitting multimedia content over best-effort networks. Scalable image and video coding meets user requirements by truncating the scalable bitstream at different quality, resolution and frame-rate points. However, the performance of scalable coding deteriorates rapidly over packet networks if the base layer packets are lost during transmission. Multiple description coding (MDC) has emerged as an effective source coding technique for robust image and video transmission over lossy networks. In this research, the problem of incorporating scalability in MDC for robust image and video transmission over best-effort networks is addressed. The first contribution of this thesis is a strategy for generating more than two descriptions using a multiple description scalar quantizer (MDSQ), with the objective of jointly decoding any number of descriptions in a balanced or unbalanced manner. The distortion constraints and design conditions for multichannel unbalanced description coding (MUDC) using several MDSQs, improving quality as the number of descriptions is increased, are formulated. Secondly, the design of the MDSQ is extended to incorporate quality scalability in each description by using the concept of successive refinement in the side quantizers of the multiple description scalar quantizer, called MDSQ-SR. The design conditions of the MDSQ-SR are formulated with the objective of improving the quality of side and joint decoding for any combination of quality refinement layers. The joint decoding of descriptions of different spatial resolutions having different quality refinement layers is demonstrated for images by combining the MUDC and MDSQ-SR schemes. Finally, a fully scalable multiple description video coding (SMDVC) scheme is proposed by integrating the MUDC and MDSQ-SR schemes in a motion-compensated temporal filtering based video coding framework.
The proposed SMDVC scheme is capable of generating, and jointly decoding, any number of descriptions in a balanced or unbalanced manner at any quality, resolution and frame rate. According to the experimental results, unbalanced joint decoding results in a 1.1 dB better peak signal-to-noise ratio (PSNR) than balanced joint decoding at the same data rate. Furthermore, the joint decoding of the MDSQ-SR based scheme gives an average of 1.35 dB and 0.3 dB better PSNR performance with respect to the state-of-the-art embedded-MDSQ for images and video respectively. The PSNR performance of the MDSQ-SR based video scheme is improved by 0.2-0.6 dB by controlling inter-description and motion vector redundancies. In addition to superior rate-distortion performance compared to embedded-MDSQ, MDSQ-SR reduces the computational complexity by 83%.
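To make the MDSQ concept concrete, here is the classic two-description staggered index assignment (the textbook baseline, not the thesis's multichannel or successive-refinement designs): the central quantizer index is split into two side indices that differ by at most one, so either side alone still localises the value to two adjacent cells.

```python
def mdsq_encode(n):
    """Staggered two-description index assignment: split the central
    quantizer index n into side indices (i, j) with i + j == n and
    |i - j| <= 1."""
    return n // 2, (n + 1) // 2

def mdsq_joint_decode(i, j):
    """Both descriptions received: recover the central index exactly."""
    return i + j

def mdsq_side_decode(i):
    """Only description one received: n is either 2i or 2i + 1, so
    return the midpoint of those two cells as the side estimate."""
    return 2 * i + 0.5
```

Joint decoding is lossless in the index domain, while side decoding trades a bounded extra distortion (half a central cell here) for robustness to losing one description.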
49

Motion segmentation of semantic objects in video sequences

Thirde, David J. January 2007 (has links)
The extraction of meaningful objects from video sequences is becoming increasingly important in many multimedia applications such as video compression and video post-production. The goal of this thesis is to review, evaluate and build upon the wealth of recent work on the problem of video object segmentation, in the context of probabilistic techniques for generic video object segmentation. Methods are suggested that solve this problem using formal probabilistic learning techniques; this allows principled justification of the methods applied to the problem of segmenting video objects. By applying a simple but effective evaluation methodology, the impact of all aspects of the video object segmentation process is quantitatively analysed. This research investigates the application of feature spaces and probabilistic models to video object segmentation. Subsequently, an efficient region-based approach to object segmentation is described, along with an evaluation of mechanisms for updating such a representation. Finally, a hierarchical Bayesian framework is proposed to allow efficient implementation and comparison of combined region-level and object-level representational schemes.
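As a minimal illustration of the probabilistic segmentation setting (a generic sketch, not the thesis's hierarchical Bayesian framework): model foreground and background pixel intensities as Gaussians and assign each pixel the maximum a posteriori (MAP) label. All parameter names are illustrative assumptions.

```python
import numpy as np

def map_segment(pixels, mu_fg, var_fg, mu_bg, var_bg, prior_fg=0.5):
    """MAP classification of pixel intensities under two Gaussian class
    models: a pixel is labelled foreground (True) when its posterior
    under the foreground model exceeds that under the background model."""
    def log_lik(x, mu, var):
        return -0.5 * np.log(2.0 * np.pi * var) - (x - mu) ** 2 / (2.0 * var)
    score_fg = log_lik(pixels, mu_fg, var_fg) + np.log(prior_fg)
    score_bg = log_lik(pixels, mu_bg, var_bg) + np.log(1.0 - prior_fg)
    return score_fg > score_bg
```

Richer feature spaces (colour, motion, texture) slot in by replacing the scalar likelihoods with multivariate ones, which is the kind of design choice the evaluation methodology above quantifies.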
50

Motion scalability for video coding with flexible spatio-temporal decompositions

Mrak, Marta January 2007 (has links)
The research presented in this thesis aims to extend the scalability range of wavelet-based video coding systems in order to achieve fully scalable coding with a wide range of available decoding points. Since temporal redundancy regularly comprises the main portion of the global video sequence redundancy, the techniques that can be generally termed motion decorrelation techniques have a central role in the overall compression performance. For this reason scalable motion modelling and coding are of utmost importance, and in this thesis possible solutions are identified and analysed. The main contributions of the presented research are grouped into two interrelated and complementary topics. Firstly, a flexible motion model with a rate-optimised estimation technique is introduced. The proposed motion model is based on tree structures and provides the high adaptability needed for layered motion coding. The flexible structure for motion compensation allows for optimisation at different stages of the adaptive spatio-temporal decomposition, which is crucial for scalable coding that targets decoding at different resolutions. By utilising an adaptive choice of wavelet filterbank, the model enables high compression based on efficient mode selection. Secondly, solutions for scalable motion modelling and coding are developed. These solutions are based on precision limiting of motion vectors and the creation of a layered motion structure that describes hierarchically coded motion. The solution based on precision limiting relies on layered bit-plane coding of motion vector values. The second solution builds on recently established techniques that impose scalability on a motion structure. The new approach is based on two major improvements: the evaluation of distortion in temporal subbands, and motion search in temporal subbands that finds the optimal motion vectors for a layered motion structure.
Exhaustive tests of the rate-distortion performance in demanding scalable video coding scenarios show the benefits of applying both the developed flexible motion model and the various solutions for scalable motion coding.
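The precision-limiting idea — layered bit-plane coding of motion vector values — can be sketched per component as follows: the magnitude is sent MSB-first, and a decoder that stops after the first few planes obtains a coarser motion vector with the undecoded low-order bits zeroed. This is a per-component sketch of the principle, not the thesis's full layered coder.

```python
def encode_bitplanes(mag, n_planes):
    """MSB-first bit-planes of a non-negative motion-vector magnitude."""
    return [(mag >> (n_planes - 1 - p)) & 1 for p in range(n_planes)]

def decode_bitplanes(planes, n_planes, keep):
    """Decode only the first 'keep' planes: yields a lower-precision
    magnitude with the undecoded low-order bits set to zero."""
    v = 0
    for p in range(keep):
        v |= planes[p] << (n_planes - 1 - p)
    return v
```

Each additional plane halves the quantisation step of the motion vectors, giving the layered precision that a scalable decoder can truncate at any point.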
