291 |
On Enhancement and Quality Assessment of Audio and Video in Communication Systems. Rossholm, Andreas. January 2014 (has links)
The use of audio and video communication has increased exponentially over the last decade, going from speech over GSM to HD-resolution video conferencing between continents on mobile devices. As usage becomes more widespread, the interest in delivering high-quality media grows, even on devices with limited resources. This includes development and enhancement of the communication chain, but also objective measurement of the perceived quality. The focus of this thesis work has been to perform enhancement within speech encoding and video decoding, to measure factors that influence audio and video performance, and to build methods to predict perceived video quality. The audio enhancement part of this thesis addresses the well-known problem in the GSM system of an interfering signal generated by the switching nature of TDMA cellular telephony. Two different solutions are given to suppress such interference internally in the mobile handset: the first uses subtractive noise cancellation employing correlators; the second uses a structure of IIR notch filters. Both solutions use control algorithms based on the state of the communication between the mobile handset and the base station. The video enhancement part presents two post-filters, designed to improve the visual quality of highly compressed video streams from standard block-based video codecs by combating both blocking and ringing artifacts; the second post-filter also performs sharpening. The third part addresses the problem of measuring audio and video delay as well as the skew between them, also known as synchronization. The method is a black-box technique, which allows it to be applied to any audiovisual application, proprietary as well as based on open standards, and it can be run on any platform and over any network.
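As background for the IIR notch filter approach, a minimal sketch of a second-order notch filter of the kind used to suppress TDMA interference. The 217 Hz notch frequency (the GSM TDMA frame rate) and the 8 kHz sample rate are illustrative assumptions, not values taken from the thesis.

```python
import math

def notch_filter(samples, f0, fs, r=0.95):
    # H(z) = (1 - 2 cos(w0) z^-1 + z^-2) / (1 - 2 r cos(w0) z^-1 + r^2 z^-2)
    # Zeros on the unit circle at +-w0 null the interferer; poles at radius
    # r just inside the circle keep the notch narrow.
    w0 = 2.0 * math.pi * f0 / fs
    b1 = -2.0 * math.cos(w0)
    a1, a2 = -2.0 * r * math.cos(w0), r * r
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = x + b1 * x1 + x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

fs = 8000
buzz = [math.sin(2 * math.pi * 217 * n / fs) for n in range(fs)]
filtered = notch_filter(buzz, 217, fs)
```

After the filter's transient dies out, the 217 Hz tone is almost completely removed while frequencies away from the notch pass largely unaffected.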
The last part addresses no-reference (NR) bitstream video quality prediction using features extracted from the coded video stream. Several methods have been used and evaluated: Multiple Linear Regression (MLR), Artificial Neural Networks (ANN), and Least Squares Support Vector Machines (LS-SVM), all showing high correlation with both MOS and objective video assessment methods such as PSNR and PEVQ. The impact of temporal, spatial, and quantization variations on perceptual video quality has also been addressed, together with the trade-off between them, and for this purpose a set of locally conducted subjective experiments was performed.
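Of the prediction methods listed, multiple linear regression is the simplest to sketch. The following pure-Python illustration fits MOS against two invented bitstream features (mean QP and bit rate) via the normal equations; the actual feature set, training data, and coefficients in the thesis differ.

```python
# Minimal multiple linear regression (MLR) sketch for no-reference quality
# prediction. The bitstream features and MOS values below are invented.

def solve(A, b):
    # Gaussian elimination with partial pivoting for a small linear system.
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_mlr(X, y):
    # Solve the normal equations (X^T X) w = X^T y with a bias column appended.
    Xb = [row + [1.0] for row in X]
    n = len(Xb[0])
    XtX = [[sum(r[i] * r[j] for r in Xb) for j in range(n)] for i in range(n)]
    Xty = [sum(r[i] * yk for r, yk in zip(Xb, y)) for i in range(n)]
    return solve(XtX, Xty)

# Hypothetical per-sequence features: [mean QP, bit rate in Mbit/s] -> MOS.
X = [[26, 4.0], [32, 2.0], [38, 1.0], [44, 0.5]]
y = [4.5, 3.8, 2.9, 2.1]
w = fit_mlr(X, y)

def predict(features):
    return sum(wi * fi for wi, fi in zip(w, features)) + w[-1]
```

The same feature vector could equally be fed to an ANN or LS-SVM regressor; MLR is shown only because it is closed-form.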
|
292 |
[en] FAST MOTION-ADAPTIVE ESTIMATION ALGORITHM APPLIED TO THE H.264/AVC STANDARD CODER / [pt] ALGORITMO RÁPIDO DE ESTIMAÇÃO ADAPTATIVO AO MOVIMENTO APLICADO AO CODIFICADOR PADRÃO H.264/AVC. GUILHERME MACHADO GOEHRINGER. 31 March 2008 (has links)
[en] The motion estimation techniques used in video compression standards enable more efficient use of transmission and storage resources by reducing the number of bits required to represent a video signal while preserving the quality of the content being processed. The objective of this Master's dissertation is to propose a new algorithm capable of reducing the large computational complexity involved in these techniques while maintaining the quality of the reconstructed signal. To this end, we present AUMHS (Adaptive Unsymmetrical-cross Multi-Hexagon-grid Search), whose main modification relative to UMHS (Unsymmetrical-cross Multi-Hexagon-grid Search) is a motion measure that classifies the scenes of a video sequence according to the detected motion, so that the motion estimation parameters and other coder parameters can be adapted accordingly. The result is a substantial gain in processing speed, with a corresponding reduction in computational cost, while preserving the quality obtained by the main algorithms in the literature. The algorithm was implemented in the H.264/AVC standard coder, where comparative performance analyses against the UMHS and FSA algorithms were carried out by measuring PSNR (Peak Signal-to-Noise Ratio), coder processing time, motion estimation processing time, bit rate, and informal subjective evaluation.
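As background for the comparison against FSA, a minimal full-search block-matching sketch: the exhaustive baseline that fast algorithms such as UMHS and AUMHS accelerate by pruning candidate displacements. The frame contents, block size, and search range below are synthetic.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    # Sum of absolute differences between the n x n block of `cur` at
    # (bx, by) and the displaced block of `ref` at (bx + dx, by + dy).
    return sum(abs(cur[by + i][bx + j] - ref[by + dy + i][bx + dx + j])
               for i in range(n) for j in range(n))

def full_search(cur, ref, bx, by, rng=2, n=4):
    # Exhaustively test every displacement in [-rng, rng]^2 that keeps the
    # reference block inside the frame, and keep the lowest-SAD candidate.
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-rng, rng + 1):
        for dx in range(-rng, rng + 1):
            if 0 <= bx + dx <= w - n and 0 <= by + dy <= h - n:
                cost = sad(cur, ref, bx, by, dx, dy, n)
                if best is None or cost < best[0]:
                    best = (cost, dx, dy)
    return best  # (SAD, dx, dy) of the best match

ref = [[(3 * x + 5 * y) % 17 for x in range(8)] for y in range(8)]
cur = [[ref[y][(x - 1) % 8] for x in range(8)] for y in range(8)]  # shifted right
```

For this synthetic pair, the block at (3, 2) in `cur` matches the reference one column to the left, so the search returns the motion vector (-1, 0) with zero residual.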
|
293 |
Error-robust coding and transformation of compressed layered hybrid video streams for packet-switched wireless networks. Halbach, Till. January 2004 (has links)
<p>This dissertation considers packet-switched wireless networks for the transmission of variable-rate layered hybrid video streams. Target applications are video streaming and broadcasting services. The work can be divided into two main parts.</p><p>In the first part, a novel quality-scalable scheme based on coefficient refinement and encoder quality constraints is developed as a possible extension to the video coding standard H.264. After a technical introduction to the coding tools of H.264, with the main focus on error resilience features, various quality scalability schemes from previous research are reviewed. Based on this discussion, an encoder-decoder framework is designed for an arbitrary number of quality layers, thereby also enabling region-of-interest coding. After that, the performance of the new system is exhaustively tested, showing that the bit rate increase typically encountered with scalable hybrid coding schemes is, for certain coding parameters, only small to moderate. The double- and triple-layer constellations of the framework are shown to outperform other systems.</p><p>The second part considers layered code streams as generated by the scheme of the first part. Various error propagation issues in hybrid streams are discussed, which leads to the definition of a decoder quality constraint and a segmentation of the code stream to be transmitted. A packetization scheme based on successive source rate consumption is drafted, followed by the formulation of the channel code rate optimization problem for an optimal assignment of available codes to the channel packets. Proper MSE-based error metrics are derived, incorporating the properties of the source signal, a terminate-on-error decoding strategy, error concealment, inter-packet dependencies, and the channel conditions. 
The Viterbi algorithm is presented as a low-complexity solution to the optimization problem, showing a strong adaptivity of the joint source-channel coding scheme to the channel conditions. An almost constant image quality is achieved, also in mismatch situations, while the overall channel code rate decreases only as much as necessary as the channel quality deteriorates. It is further shown that the variance of the code distributions is small, and that the codes are assigned irregularly across the channel packets.</p><p>A double-layer constellation of the framework clearly outperforms other schemes by a substantial margin.</p><p>Keywords — Digital lossy video compression, visual communication, variable bit rate (VBR), SNR scalability, layered image processing, quality layer, hybrid code stream, predictive coding, progressive bit stream, joint source channel coding, fidelity constraint, channel error robustness, resilience, concealment, packet-switched, mobile and wireless ATM, noisy transmission, packet loss, binary symmetric channel, streaming, broadcasting, satellite and radio links, H.264, MPEG-4 AVC, Viterbi, trellis, unequal error protection</p>
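To give a rough flavor of unequal error protection itself, the following sketch assigns stronger channel codes to packets whose loss would cost more distortion, subject to a parity budget. The code rates, residual loss probabilities, and packet costs are all invented, and this greedy stand-in is not the dissertation's trellis/Viterbi optimization.

```python
CODES = {          # channel code rate -> residual packet-loss probability
    1.0: 0.20,     # no protection
    0.75: 0.05,
    0.5: 0.01,
}
STEPS = sorted(CODES)  # code rates from strongest to weakest protection

def assign_codes(costs, budget):
    # Greedily spend the parity budget where it buys the largest reduction
    # in expected distortion per unit of extra channel rate.
    rates = [1.0] * len(costs)
    while True:
        best = None
        for i, cost in enumerate(costs):
            k = STEPS.index(rates[i])
            if k == 0:
                continue                        # already strongest code
            nxt = STEPS[k - 1]
            extra = 1.0 / nxt - 1.0 / rates[i]  # extra channel bits per source bit
            if extra > budget:
                continue
            gain = cost * (CODES[rates[i]] - CODES[nxt]) / extra
            if best is None or gain > best[0]:
                best = (gain, i, nxt, extra)
        if best is None:
            return rates
        _, i, nxt, extra = best
        rates[i] = nxt
        budget -= extra

protection = assign_codes([10.0, 1.0, 1.0], budget=0.4)
```

With this budget only one upgrade fits, and it goes to the high-cost packet, illustrating why the resulting code assignment across packets is irregular.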
|
295 |
Skalierbares und flexibles Live-Video Streaming mit der Media Internet Streaming Toolbox (Scalable and Flexible Live Video Streaming with the Media Internet Streaming Toolbox). Pranke, Nico. 17 November 2009 (has links)
This work deals with the development and application of various concepts and algorithms for scalable live streaming of video, and with their implementation in the Media Internet Streaming Toolbox. The toolbox provides an extensible, platform-independent infrastructure for building every part of a live streaming system, from video capture through media processing and coding to delivery. The focus is on the flexible description of media processing and stream construction, and on the generation of client-specific stream formats with different qualities of service for as many clients as possible and their distribution over the Internet. An integrated graph-based concept is designed that combines Component Encoding Stream Construction, the use of compresslets, and automated flow-graph construction. The parts of the flow graph relevant to stream construction are executed separately from the rest for groups of clients with identical state. This leads to a maximum computational load that is independent of the number of clients, which is both shown theoretically and confirmed by concrete measurements. As examples of the use of the toolbox, two wavelet-based stream formats, among others, are developed, integrated, and compared with respect to coding efficiency and scalability.
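The per-group execution described above can be caricatured in a few lines: clients are bucketed by requested stream format, and the expensive stream-construction step runs once per bucket, so the encoding load scales with the number of distinct formats rather than the number of clients. All names below are invented.

```python
from collections import defaultdict

def encode(frame, fmt):
    # Stand-in for the expensive stream-construction step; the call counter
    # only exists to demonstrate the once-per-group property.
    encode.calls += 1
    return f"{fmt}:{frame}"
encode.calls = 0

def serve_frame(frame, clients):
    # clients: list of (client_id, format_key); format_key stands in for the
    # resolution/bit-rate/codec state that determines the produced stream.
    groups = defaultdict(list)
    for cid, fmt in clients:
        groups[fmt].append(cid)
    encoded = {fmt: encode(frame, fmt) for fmt in groups}  # once per group
    return {cid: encoded[fmt] for fmt, cids in groups.items() for cid in cids}

out = serve_frame("frame0", [(1, "low"), (2, "high"), (3, "low"), (4, "low")])
```

Four clients, two distinct formats: the construction step runs twice, independent of how many clients share each format.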
|
296 |
Video anatomy: spatial-temporal video profile. Cai, Hongyuan. 31 July 2014 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / A massive number of videos are uploaded to video websites, and smooth video browsing, editing, retrieval, and summarization are in demand. Most videos employ several types of camera operations to expand the field of view, emphasize events, and express cinematic effects. To digest the heterogeneous videos found on video websites and in databases, video clips are profiled into a 2D image scroll that contains both spatial and temporal information for video preview. The video profile is visually continuous, compact, and scalable, and it indexes each frame. This work analyzes camera kinematics, including zoom, translation, and rotation, and categorizes camera actions as their combinations. An automatic video summarization framework is proposed and developed. After conventional video clip segmentation, and further segmentation into sections of smooth camera operation, the global flow field under all camera actions is investigated for profiling various types of video. A new algorithm has been designed to extract the major flow direction and a convergence factor using condensed images. This work then proposes a uniform scheme to segment video clips and sections, sample the video volume across the major flow, and compute the flow convergence factor, in order to obtain an intrinsic scene space that is less influenced by camera ego-motion. A motion blur technique has also been used to render dynamic targets in the profile. The resulting video profile can be displayed in a video track to guide access to video frames, help video editing, and facilitate applications such as surveillance, visual archiving of environments, video retrieval, and online video preview.
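A toy sketch, loosely in the spirit of the condensed-image idea above: estimate a dominant horizontal displacement between two frames by comparing their 1D column projections. The projection and search details here are invented, and the thesis method additionally handles zoom, rotation, and the convergence factor.

```python
def column_profile(frame):
    # Collapse a 2D list of luma samples into one value per column.
    return [sum(row[x] for row in frame) for x in range(len(frame[0]))]

def dominant_shift(prev, cur, max_shift=3):
    # Return the displacement d minimizing the mean absolute difference
    # between cur and prev shifted by d columns, over the overlapping part.
    p, c = column_profile(prev), column_profile(cur)
    def cost(d):
        xs = range(max(0, d), min(len(p), len(p) + d))
        return sum(abs(c[x] - p[x - d]) for x in xs) / len(xs)
    return min(range(-max_shift, max_shift + 1), key=cost)

prev = [[(3 * x + 5 * y) % 17 for x in range(8)] for y in range(6)]
cur = [[row[(x - 1) % 8] for x in range(8)] for row in prev]  # moved right by one
```

Because the whole frame moved coherently, the 1D profiles carry enough information to recover the global motion without a dense flow field.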
|
297 |
A new adaptive trilateral filter for in-loop filtering. Kesireddy, Akitha. January 2014 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / HEVC has achieved significant coding efficiency improvements beyond existing video coding standards by employing many new coding tools. The Deblocking Filter, Sample Adaptive Offset, and Adaptive Loop Filter have been introduced for in-loop filtering during HEVC standardization. However, these filters operate in the spatial domain despite the temporal correlation within video sequences. To reduce artifacts and better align object boundaries in video, a new in-loop filtering algorithm is proposed and implemented in the HM-11.0 reference software. The proposed algorithm yields an average bitrate reduction of about 0.7% and improves the PSNR of the decoded frame by 0.05%, 0.30%, and 0.35% in the luma and the two chroma components, respectively.
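The proposed trilateral filter builds on bilateral filtering, which weights each neighbor by both spatial distance and intensity difference so that smoothing does not cross strong edges. A minimal 1D sketch with illustrative parameters follows; the actual in-loop filter operates on reconstructed blocks inside the codec and adds a further weighting term.

```python
import math

def bilateral_1d(signal, sigma_s=1.5, sigma_r=10.0, radius=3):
    # Each output sample is a weighted average of its neighborhood, with
    # weights falling off in both position (sigma_s) and value (sigma_r).
    out = []
    for i, v in enumerate(signal):
        wsum = vsum = 0.0
        for j in range(max(0, i - radius), min(len(signal), i + radius + 1)):
            w = (math.exp(-((i - j) ** 2) / (2 * sigma_s ** 2))
                 * math.exp(-((signal[j] - v) ** 2) / (2 * sigma_r ** 2)))
            wsum += w
            vsum += w * signal[j]
        out.append(vsum / wsum)
    return out

edge = bilateral_1d([0.0] * 5 + [100.0] * 5)          # a sharp edge survives
bump = bilateral_1d([10.0, 10.0, 12.0, 10.0, 10.0])   # small ripple is smoothed
```

This edge-preserving behavior is what makes such filters attractive for combating coding artifacts without blurring object boundaries.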
|
298 |
Multimedia Forensics Using Metadata. Ziyue Xiang (17989381). 21 February 2024 (has links)
<p dir="ltr">The rapid development of machine learning techniques makes it possible to manipulate or synthesize video and audio information while introducing nearly undetectable artifacts. Most media forensics methods analyze the high-level data (e.g., pixels from video, temporal signals from audio) decoded from compressed media data. Since media manipulation or synthesis methods usually aim to improve the quality of such high-level data directly, acquiring forensic evidence from these data has become increasingly challenging. In this work, we focus on media forensics techniques that use the metadata in media formats, which includes container metadata and coding parameters in the encoded bitstream. Since many media manipulation and synthesis methods do not attempt to hide metadata traces, it is possible to use them for forensics tasks. First, we present a video forensics technique using metadata embedded in MP4/MOV video containers. Our proposed method achieved high performance in video manipulation detection, source device attribution, social media attribution, and manipulation tool identification on publicly available datasets. Second, we present a transformer-neural-network-based MP3 audio forensics technique using low-level codec information. Our proposed method can localize multiple compressed segments in MP3 files, with higher localization accuracy than other methods. Third, we present an H.264-based video device matching method, which can determine whether two video sequences were captured by the same device even if the method has never encountered that device before. Our proposed method achieved good performance in a three-fold cross-validation scheme on a publicly available video forensics dataset containing 35 devices. Fourth, we present a Graph Neural Network (GNN) based approach for the analysis of MP4/MOV metadata trees. The proposed method is trained using Self-Supervised Learning (SSL), which increases its robustness and makes it capable of handling missing or unseen data. Fifth, we present an efficient approach to computing spectrogram features from MP3-compressed audio signals. The proposed approach decreases the complexity of speech feature computation by ~77.6% and saves ~37.87% of the MP3 decoding time, and the resulting spectrogram features lead to higher synthetic speech detection performance.</p>
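For a concrete taste of the container metadata used as evidence, here is a minimal sketch of walking the top-level box ("atom") layout of an MP4/MOV file. Real parsers also handle 64-bit box sizes and nested boxes, and the synthetic byte blob below is invented for illustration.

```python
import struct

def top_level_boxes(data):
    # Each basic MP4/MOV box starts with a 4-byte big-endian size followed
    # by a 4-character type code; the size covers the 8-byte header too.
    boxes, pos = [], 0
    while pos + 8 <= len(data):
        size, = struct.unpack(">I", data[pos:pos + 4])
        btype = data[pos + 4:pos + 8].decode("ascii", "replace")
        if size < 8:
            break  # malformed or 64-bit size; a real parser handles this
        boxes.append((btype, size))
        pos += size
    return boxes

# A tiny synthetic container: an ftyp box followed by a free box.
blob = (struct.pack(">I", 16) + b"ftyp" + b"isom" + struct.pack(">I", 512)
        + struct.pack(">I", 8) + b"free")
# top_level_boxes(blob) -> [("ftyp", 16), ("free", 8)]
```

The ordering, sizes, and presence of such boxes form exactly the kind of metadata fingerprint that manipulation tools tend to alter without hiding.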
|