  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.

Error resilience for video coding services over packet-based networks

Zhang, Jian, Electrical Engineering, Australian Defence Force Academy, UNSW January 1999 (has links)
Error resilience is an important issue when coded video data is transmitted over wired and wireless networks. Errors can be introduced by network congestion, mis-routing and channel noise. These transmission errors can result in bit errors being introduced into the transmitted data or packets of data being completely lost. Consequently, the quality of the decoded video is degraded significantly. This thesis describes new techniques for minimising this degradation. To verify video error resilience tools, it is first necessary to consider the methods used to carry out experimental measurements. For most audio-visual services, streams of both audio and video data need to be simultaneously transmitted on a single channel. The inclusion of the impact of multiplexing schemes, such as MPEG-2 Systems, in error resilience studies is also an important consideration. It is shown that error resilience measurements including the effect of the Systems Layer differ significantly from those based only on the Video Layer. Two major issues of error resilience are investigated within this thesis: resynchronisation after error detection, and error concealment. Results for resynchronisation using small slices, adaptive slice sizes and macroblock resynchronisation schemes are provided. These measurements show that the macroblock resynchronisation scheme achieves the best performance, although it is not included in the MPEG-2 standard. The performance of the adaptive slice size scheme, however, is similar to that of the macroblock resynchronisation scheme, and this approach is compatible with the MPEG-2 standard. The most important contribution of this thesis is a new concealment technique, namely Decoder Motion Vector Estimation (DMVE). The decoded video quality can be improved significantly with this technique. It utilises the temporal redundancy between the current and the previous frames, and the correlation between lost macroblocks and their surrounding pixels. Motion estimation can therefore be applied again to search the previous picture for a match to the lost macroblocks. The process is similar to the one the encoder performs, but it takes place in the decoder. The integration of techniques such as DMVE with small slices, adaptive slice sizes or macroblock resynchronisation is also evaluated. This provides an overview of the performance of the individual techniques compared to the combined techniques. Results show that high performance can be achieved by integrating DMVE with an effective resynchronisation scheme, even at high cell loss rates. The results of this thesis demonstrate clearly that the MPEG-2 standard is capable of providing a high level of error resilience, even in the presence of high loss. The key to this performance is appropriate tuning of encoders and effective concealment in decoders.
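The decoder-side search described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function name `conceal_block`, the block size, the one-pixel border, the search range and the sum-of-absolute-differences cost are all assumptions made for illustration. The ring of correctly received pixels around a lost block is matched against the previous frame, and the block at the best-matching displacement is copied in.

```python
def conceal_block(cur, prev, top, left, size=4, border=1, search=2):
    """Conceal the lost size x size block at (top, left) of `cur`, in place.

    `cur` and `prev` are equally sized 2-D lists of luma values. The block is
    assumed to lie far enough from the picture edge for the search window.
    Returns the estimated motion vector (dy, dx).
    """
    h, w = len(prev), len(prev[0])
    # Ring of received pixels surrounding the lost block (hypothetical layout).
    ring = [
        (y, x)
        for y in range(top - border, top + size + border)
        for x in range(left - border, left + size + border)
        if not (top <= y < top + size and left <= x < left + size)
        and 0 <= y < h and 0 <= x < w
    ]
    best_cost, best_mv = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates whose ring or block would leave the picture.
            if not (0 <= top + dy - border and top + dy + size + border <= h
                    and 0 <= left + dx - border
                    and left + dx + size + border <= w):
                continue
            # Sum of absolute differences between the ring and its
            # displaced counterpart in the previous frame.
            cost = sum(abs(cur[y][x] - prev[y + dy][x + dx]) for y, x in ring)
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (dy, dx)
    dy, dx = best_mv
    # Fill the lost block from the best-matching position.
    for y in range(top, top + size):
        for x in range(left, left + size):
            cur[y][x] = prev[y + dy][x + dx]
    return best_mv
```

A real decoder would work on 16x16 macroblocks and could use a wider border or a different matching cost; the small defaults here only keep the example short.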

Transform Coefficient Thresholding and Lagrangian Optimization for H.264 Video Coding / Transformkoefficient-tröskling och Lagrangeoptimering för H.264 Videokodning

Carlsson, Pontus January 2004 (has links)
H.264, also known as MPEG-4 Part 10: Advanced Video Coding, is the latest MPEG standard for video coding. It provides approximately 50% bit rate savings for equivalent perceptual quality compared to any previous standard. In the same fashion as previous MPEG standards, only the bitstream syntax and the decoder are specified. Hence, coding performance is determined not only by the standard itself but also by the implementation of the encoder. In this report we propose two methods for improving the coding performance while remaining fully compliant with the standard. After transformation and quantization, the transform coefficients are usually entropy coded and embedded in the bitstream. However, it can be beneficial to discard some of them if the number of bits saved is sufficiently large. This is usually referred to as coefficient thresholding and is investigated in the scope of H.264 in this report. Lagrangian optimization for video compression has proven to yield substantial improvements in perceived quality, and the H.264 Reference Software has been designed around this concept. When performing Lagrangian optimization, lambda is a crucial parameter that determines the tradeoff between rate and distortion. We propose a new method to select lambda and the quantization parameter for non-reference frames in H.264. The two methods are shown to achieve significant improvements. When combined, they reduce the bitrate by around 12% while preserving the video quality in terms of average PSNR. To aid development of H.264, a software tool has been created to visualize the coding process and present statistics. This tool is capable of displaying information such as bit distribution, motion vectors, predicted pictures and motion compensated block sizes.
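To make the rate-distortion tradeoff concrete, the sketch below applies the usual Lagrangian cost J = D + lambda * R to the decision of keeping or discarding a single quantized coefficient. It is only an illustration of the general idea, not the method proposed in the report: the names `level_bits` and `threshold_coefficients` are hypothetical, and the rate model is a toy Exp-Golomb-like bit count rather than H.264's actual entropy coder.

```python
def level_bits(level):
    """Toy rate model (an assumption): a zero costs 1 bit, a nonzero level
    costs an Exp-Golomb-like 2 * bit_length + 1 bits."""
    return 1 if level == 0 else 2 * abs(level).bit_length() + 1


def threshold_coefficients(coeffs, qstep, lam):
    """Quantize each coefficient, then zero it when that lowers J = D + lam * R.

    D is the squared reconstruction error; R comes from level_bits().
    Returns the list of quantization levels to transmit.
    """
    levels = []
    for c in coeffs:
        level = int(round(c / qstep))
        # Cost of keeping the quantized level.
        j_keep = (c - level * qstep) ** 2 + lam * level_bits(level)
        # Cost of discarding the coefficient (reconstruct it as zero).
        j_zero = c ** 2 + lam * level_bits(0)
        levels.append(level if j_keep <= j_zero else 0)
    return levels
```

With `lam = 0` the function reduces to plain quantization; raising `lam` makes bits more expensive, so more small coefficients are discarded.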

Fast Mode Selection Algoritm for H.264 Video Coding

Hållmarker, Ola, Linderoth, Martin January 2005 (has links)
ITU-T and the Moving Picture Experts Group (MPEG) have jointly, under the name of Joint Video Team (JVT), developed a new video coding standard. The standard is called H.264 and is also known as Advanced Video Coding (AVC) or MPEG-4 Part 10. Comparisons show that H.264 greatly outperforms MPEG-2, currently used in DVD and digital TV. H.264 halves the bit rate at equal image quality. This strong rate-distortion performance nevertheless comes at the cost of high computational complexity, especially on the encoder side. Handling of audio and video, e.g. compressing and filtering, is quite complex and requires high-performance hardware and software. A video encoder consists of a number of modules that find the best coding parameters. For each macroblock, several modes are evaluated in order to achieve optimal coding. The reference implementation of H.264 uses a brute-force search for this mode selection, which is extremely computationally demanding. In order to perform video encoding with satisfactory speed, there is an obvious need to reduce the number of modes that are evaluated. This thesis proposes an algorithm which reduces the number of modes and reference frames that are evaluated. The algorithm can be regulated in order to fulfill the demand on quality versus speed. Six times faster encoding can be obtained without losing perceptual image quality. By allowing some quality degradation, the encoding becomes up to 20 times faster.
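The abstract does not spell out how the candidate modes are pruned, so the following is only a generic sketch of the contrast between a brute-force search and a reduced one. The function `select_mode`, the `good_enough` threshold and the early-exit rule are assumptions for illustration and should not be read as the thesis's algorithm.

```python
def select_mode(modes, cost_fn, good_enough=None):
    """Pick the cheapest coding mode for one macroblock.

    modes       -- candidate modes, ordered from most to least likely
    cost_fn     -- callable returning the coding cost of a mode
    good_enough -- optional cost threshold; once the best cost so far is at
                   or below it, the remaining modes are skipped
    Returns (best_mode, best_cost, number_of_modes_evaluated).
    """
    best_mode, best_cost, evaluated = None, float("inf"), 0
    for mode in modes:
        cost = cost_fn(mode)
        evaluated += 1
        if cost < best_cost:
            best_mode, best_cost = mode, cost
        # Early exit trades a possibly better mode for fewer evaluations.
        if good_enough is not None and best_cost <= good_enough:
            break
    return best_mode, best_cost, evaluated
```

Leaving `good_enough` unset gives the exhaustive search; raising it plays the role of a quality-versus-speed control, since a looser threshold ends the search earlier.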

Improving user comprehension and entertainment in wireless streaming media : introducing cognitive quality of service

Wikstrand, Greger January 2003 (has links)
In future mobile networks service quality might be poor. A new measure is needed to be able to assess services in terms of their effectiveness and usefulness despite their lacking visual appeal. Cognitive Quality of Service is a way to measure the effectiveness in use of a networked service. This thesis introduces Cognitive Quality of Service and puts it in relation to other ways to measure quality in streaming media. Through four studies the concept is used to improve multicast performance in a WLAN, to assess the effectiveness of simple animations compared to video, to build an application that fuses video and animations, and to assess the differences between various levels of streaming video quality. Guidelines on how to measure Cognitive Quality of Service are introduced based on a review of available literature and later analyzed in light of the studies presented in the thesis. It turns out that the guidelines are sound and should be used as a basis for assessing Cognitive Quality of Service. Finally, the usefulness of Cognitive Quality of Service is analyzed. It turns out that it is especially useful when comparing different media, e.g. animations and video. In the video-only case even bit rate might be a useful predictor of subjective quality. / In the future, users will watch video sequences on wireless devices such as mobile phones. Because of technical factors such as interference, and because of the cost, the quality they receive will not be comparable to, for example, the quality obtained when watching television. Even so, such video can be assumed to be interesting and informative. The thesis introduces and applies the concept of Cognitive Quality of Service (CQoS). The concept is defined such that the transmission yielding the best comprehension and emotional response also has the best CQoS. Measuring CQoS should follow certain guidelines, particularly because comprehension is hard to measure while someone is watching video. Together with colleagues, the author examined how conditions for the radio transmission itself can be improved (Study I). An algorithm that gives multicast packets better protection against collisions is shown to improve transmission capacity for streaming video in a wireless network. Animations are a low-bandwidth alternative to video. An experiment examined how well animations compare with video of varying quality in informing the user and providing a good experience (Study II). Animations turned out to be better for comprehension, while video gave a better emotional experience. More experienced viewers preferred the video, while less experienced viewers preferred the animations. The question then was how to combine the advantages of each medium to obtain the best possible mix. The animations were cheap and easy to understand, whereas the video was more expensive and more interesting. A prototype interface was built in which users could choose for themselves which mix of the two alternatives to display (Study III). The test panel turned out to prefer video and also wanted more information about the players and the match. Despite the merits of animations, video can still be expected to dominate in the future. A final study was carried out to see whether results similar to those found earlier between video and animations could also be found across different video quality levels (Study IV). As long as a single format was used, the relationships proved simpler: more was simply better, up to a limit beyond which increasing the transmission resources for the video added nothing. In summary, the studies show that CQoS can provide valuable design knowledge, particularly when comparing different forms of presentation, in this case animations and video. The next step is to apply CQoS to two-way communication, especially Conversational Multimedia (CMM), roughly video telephony, where the opportunities to assemble a media mix adapted to the circumstances are particularly good.

SSIM-Inspired Quality Assessment, Compression, and Processing for Visual Communications

Rehman, Abdul January 2013 (has links)
Objective Image and Video Quality Assessment (I/VQA) measures predict image/video quality as perceived by human beings - the ultimate consumers of visual data. Existing research in the area is mainly limited to benchmarking and monitoring of visual data. The use of I/VQA measures in the design and optimization of image/video processing algorithms and systems is more desirable, challenging and fruitful but has not been well explored. Among the recently proposed objective I/VQA approaches, the structural similarity (SSIM) index and its variants have emerged as promising measures that show superior performance as compared to the widely used mean squared error (MSE) and are computationally simple compared with other state-of-the-art perceptual quality measures. In addition, SSIM has a number of desirable mathematical properties for optimization tasks. The goal of this research is to break the tradition of using MSE as the optimization criterion for image and video processing algorithms. We tackle several important problems in visual communication applications by exploiting SSIM-inspired design and optimization to achieve significantly better performance. Firstly, the original SSIM is a Full-Reference IQA (FR-IQA) measure that requires access to the original reference image, making it impractical in many visual communication applications. We propose a general purpose Reduced-Reference IQA (RR-IQA) method that can estimate SSIM with high accuracy with the help of a small number of RR features extracted from the original image. Furthermore, we introduce and demonstrate the novel idea of partially repairing an image using RR features. Secondly, image processing algorithms such as image de-noising and image super-resolution are required at various stages of visual communication systems, starting from image acquisition to image display at the receiver. 
We incorporate SSIM into the framework of sparse signal representation and non-local means methods and demonstrate improved performance in image de-noising and super-resolution. Thirdly, we incorporate SSIM into the framework of perceptual video compression. We propose an SSIM-based rate-distortion optimization scheme and an SSIM-inspired divisive optimization method that transforms the DCT domain frame residuals to a perceptually uniform space. Both approaches demonstrate the potential to largely improve the rate-distortion performance of state-of-the-art video codecs. Finally, in real-world visual communications, it is a common experience that end-users receive video with significantly time-varying quality due to the variations in video content/complexity, codec configuration, and network conditions. How human visual quality of experience (QoE) changes with such time-varying video quality is not yet well-understood. We propose a quality adaptation model that is asymmetrically tuned to increasing and decreasing quality. The model improves upon the direct SSIM approach in predicting subjective perceptual experience of time-varying video quality.
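For readers unfamiliar with the measure, the sketch below computes the standard SSIM index over a single window of pixels, using the commonly cited constants C1 = (0.01 L)^2 and C2 = (0.03 L)^2 for dynamic range L. It illustrates only the basic formula, not the reduced-reference or rate-distortion methods proposed in the thesis. The function name `ssim` and the flat-list input are assumptions, and practical implementations average the index over many local windows.

```python
def ssim(x, y, dynamic_range=255.0):
    """SSIM index of two equally long pixel sequences, treated as one window."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Population variances and covariance of the two windows.
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Stabilizing constants from the standard SSIM definition.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov_xy + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

Identical windows score 1, a small uniform brightness shift scores just below 1, and a window with reversed structure scores below 0, which is the kind of structural sensitivity that MSE does not capture.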

Multi-view Video Coding Via Dense Depth Field

Ozkalayci, Burak Oguz 01 September 2006 (has links) (PDF)
Emerging 3-D applications and 3-D display technologies raise transmission problems for next-generation multimedia data. Multi-view Video Coding (MVC), which is on its way to standardization via ISO MPEG, is one of the challenging topics in this area. In this thesis, a 3-D geometry-based MVC approach is proposed and analyzed in terms of its compression performance. For this purpose, the overall study is partitioned into three parts. The first step is dense depth estimation of a view from a fully calibrated multi-view set. The calibration information and smoothness assumptions are utilized for determining dense correspondences via a Markov Random Field (MRF) model, which is solved by the Belief Propagation (BP) method. In the second part, the estimated dense depth maps are utilized for generating (predicting) arbitrary (other camera) views of a scene, which is known as novel view generation. A 3-D warping algorithm, followed by an occlusion-compatible hole-filling process, is implemented for this aim. In order to suppress the occlusion artifacts, an intermediate novel view generation method, which fuses two novel views generated from different source views, is developed. Finally, in the last part, the dense depth estimation and intermediate novel view generation tools are utilized in the proposed H.264-based MVC scheme for the removal of the spatial redundancies between different views. The performance of the proposed approach is compared against simulcast coding and a recent MVC proposal, which is expected to be the standard recommendation for MPEG in the near future. These results show that the geometric approaches in MVC can still be utilized, especially in certain 3-D applications, in addition to conventional temporal motion compensation techniques, although the rate-distortion performances of geometry-free approaches are quite superior.

Scalable video communications: bitstream extraction algorithms for streaming, conferencing and 3DTV

Palaniappan, Ramanathan 19 August 2011 (has links)
This research investigates scalable video communications and its applications to video streaming, conferencing and 3DTV. Scalable video coding (SVC) is a layer-based encoding scheme that provides spatial, temporal and quality scalability. The heterogeneity of the Internet and of clients' operating environments necessitates the adaptation of media content to ensure a satisfactory multimedia experience. SVC's layer structure allows the extraction of partial bitstreams at reduced spatial, quality and temporal resolutions that adjust the media bitrate at a fine granularity to changes in network state. The main focus of this research work is on developing such extraction algorithms in the context of SVC. Based on a combination of metadata computations and prediction mechanisms, these algorithms evaluate the quality contribution of each layer in the SVC bitstream and make extraction decisions that are aimed at maximizing video quality while operating within the available bandwidth resources. These techniques are applied in two-way interaction and one-way streaming of 2D and 3D content. Depending on the delay tolerance of these applications, rate-distortion optimized extraction algorithms are proposed. For conferencing applications, the extraction decisions are made over single frames and frame pairs due to tight end-to-end delay constraints. The proposed extraction algorithms for 3D content streaming maximize the overall perceived 3D quality based on human stereoscopic perception. When compared to current extraction methods, the new algorithms offer better video quality at a given bitrate while performing fewer metadata computations in the post-encoding phase. The solutions proposed for each application achieve the recurring goal of maintaining the best possible level of end-user quality of multimedia experience in spite of network impairments.
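The general shape of a bandwidth-constrained extraction decision can be sketched as follows. This is a simple greedy heuristic under assumed inputs, not one of the algorithms developed in the thesis. The function `extract_layers`, the per-layer rate and quality-gain figures, and the single-parent dependency model are all hypothetical.

```python
def extract_layers(layers, budget_kbps):
    """Greedily pick SVC layers that fit within a bitrate budget.

    layers maps a layer name to (rate_kbps, quality_gain, depends_on), where
    depends_on is the name of the layer it needs, or None for the base layer.
    Returns (chosen_layer_names, total_rate_kbps).
    """
    chosen, used = [], 0.0
    remaining = dict(layers)
    while True:
        # A layer is eligible if its dependency is already chosen and it
        # still fits within the budget.
        eligible = [
            (gain / rate, name)
            for name, (rate, gain, dep) in remaining.items()
            if (dep is None or dep in chosen) and used + rate <= budget_kbps
        ]
        if not eligible:
            return chosen, used
        # Take the layer with the best quality gain per kbit/s.
        _, name = max(eligible)
        used += remaining.pop(name)[0]
        chosen.append(name)
```

For example, with a 200 kbit/s base layer and three enhancement layers, a 650 kbit/s budget admits the base layer plus the two cheapest enhancements, while a budget below the base-layer rate yields an empty selection.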
