291 |
On Enhancement and Quality Assessment of Audio and Video in Communication Systems. Rossholm, Andreas. January 2014 (has links)
The use of audio and video communication has increased exponentially over the last decade, going from speech over GSM to HD-resolution video conferencing between continents on mobile devices. As usage becomes more widespread, the interest in delivering high-quality media grows, even on devices with limited resources. This includes development and enhancement of the communication chain, but also objective measurement of the perceived quality. The focus of this thesis work has been to perform enhancement within speech encoding and video decoding, to measure factors that influence audio and video performance, and to build methods to predict perceived video quality. The audio enhancement part of this thesis addresses the well-known problem in the GSM system of an interfering signal generated by the switching nature of TDMA cellular telephony. Two different solutions are given to suppress such interference internally in the mobile handset: the first uses subtractive noise cancellation employing correlators; the second uses a structure of IIR notch filters. Both solutions use control algorithms based on the state of the communication between the mobile handset and the base station. The video enhancement part presents two post-filters, designed to improve the visual quality of highly compressed video streams from standard block-based video codecs by combating both blocking and ringing artifacts; the second post-filter also performs sharpening. The third part addresses the problem of measuring audio and video delay as well as the skew between them, also known as synchronization. The method is a black-box technique, which allows it to be applied to any audiovisual application, proprietary as well as based on open standards, and it can be run on any platform and over any network.
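As background for the IIR notch filter approach, a minimal sketch of a second-order notch filter of the kind used to suppress TDMA interference. The 217 Hz notch frequency (the GSM TDMA frame rate) and the 8 kHz sample rate are illustrative assumptions, not values taken from the thesis.

```python
import math

def notch_filter(samples, f0, fs, r=0.95):
    # H(z) = (1 - 2 cos(w0) z^-1 + z^-2) / (1 - 2 r cos(w0) z^-1 + r^2 z^-2)
    # Zeros on the unit circle at +-w0 null the interferer; poles at radius
    # r just inside the circle keep the notch narrow.
    w0 = 2.0 * math.pi * f0 / fs
    b1 = -2.0 * math.cos(w0)
    a1, a2 = -2.0 * r * math.cos(w0), r * r
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = x + b1 * x1 + x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out

fs = 8000
buzz = [math.sin(2 * math.pi * 217 * n / fs) for n in range(fs)]
filtered = notch_filter(buzz, 217, fs)
```

After the filter's transient dies out, the 217 Hz tone is almost completely removed while frequencies away from the notch pass largely unaffected.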
The last part addresses no-reference (NR) bitstream video quality prediction using features extracted from the coded video stream. Several methods have been used and evaluated: Multiple Linear Regression (MLR), Artificial Neural Networks (ANN), and Least Squares Support Vector Machines (LS-SVM), all showing high correlation with both MOS and objective video assessment methods such as PSNR and PEVQ. The impact of temporal, spatial, and quantization variations on perceptual video quality has also been addressed, together with the trade-off between them, and for this purpose a set of locally conducted subjective experiments was performed.
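Of the prediction methods listed, multiple linear regression is the simplest to sketch. The following pure-Python illustration fits MOS against two invented bitstream features (mean QP and bit rate) via the normal equations; the actual feature set, training data, and coefficients in the thesis differ.

```python
# Minimal multiple linear regression (MLR) sketch for no-reference quality
# prediction. The bitstream features and MOS values below are invented.

def solve(A, b):
    # Gaussian elimination with partial pivoting for a small linear system.
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_mlr(X, y):
    # Solve the normal equations (X^T X) w = X^T y with a bias column appended.
    Xb = [row + [1.0] for row in X]
    n = len(Xb[0])
    XtX = [[sum(r[i] * r[j] for r in Xb) for j in range(n)] for i in range(n)]
    Xty = [sum(r[i] * yk for r, yk in zip(Xb, y)) for i in range(n)]
    return solve(XtX, Xty)

# Hypothetical per-sequence features: [mean QP, bit rate in Mbit/s] -> MOS.
X = [[26, 4.0], [32, 2.0], [38, 1.0], [44, 0.5]]
y = [4.5, 3.8, 2.9, 2.1]
w = fit_mlr(X, y)

def predict(features):
    return sum(wi * fi for wi, fi in zip(w, features)) + w[-1]
```

The same feature vector could equally be fed to an ANN or LS-SVM regressor; MLR is shown only because it is closed-form.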
|
292 |
[en] FAST MOTION-ADAPTIVE ESTIMATION ALGORITHM APPLIED TO THE H.264/AVC STANDARD CODER / [pt] ALGORITMO RÁPIDO DE ESTIMAÇÃO ADAPTATIVO AO MOVIMENTO APLICADO AO CODIFICADOR PADRÃO H.264/AVC. GUILHERME MACHADO GOEHRINGER. 31 March 2008 (has links)
[en] The motion estimation techniques used in video compression standards enable more efficient use of transmission and storage resources by reducing the number of bits required to represent a video signal while preserving the quality of the content being processed. The objective of this Master's dissertation is to propose a new algorithm capable of reducing the large computational complexity involved in these techniques while maintaining the quality of the reconstructed signal. To this end, we present AUMHS (Adaptive Unsymmetrical-cross Multi-Hexagon-grid Search), whose main modification relative to UMHS (Unsymmetrical-cross Multi-Hexagon-grid Search) is a motion measure that classifies the scenes of a video sequence according to the detected motion, so that the motion estimation parameters and other coder parameters can be adapted accordingly. The result is a substantial gain in processing speed, with a corresponding reduction in computational cost, while preserving the quality obtained by the main algorithms in the literature. The algorithm was implemented in the H.264/AVC standard coder, where comparative performance analyses against the UMHS and FSA algorithms were carried out by measuring PSNR (Peak Signal-to-Noise Ratio), coder processing time, motion estimation processing time, bit rate, and informal subjective evaluation.
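As background for the comparison against FSA, a minimal full-search block-matching sketch: the exhaustive baseline that fast algorithms such as UMHS and AUMHS accelerate by pruning candidate displacements. The frame contents, block size, and search range below are synthetic.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    # Sum of absolute differences between the n x n block of `cur` at
    # (bx, by) and the displaced block of `ref` at (bx + dx, by + dy).
    return sum(abs(cur[by + i][bx + j] - ref[by + dy + i][bx + dx + j])
               for i in range(n) for j in range(n))

def full_search(cur, ref, bx, by, rng=2, n=4):
    # Exhaustively test every displacement in [-rng, rng]^2 that keeps the
    # reference block inside the frame, and keep the lowest-SAD candidate.
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-rng, rng + 1):
        for dx in range(-rng, rng + 1):
            if 0 <= bx + dx <= w - n and 0 <= by + dy <= h - n:
                cost = sad(cur, ref, bx, by, dx, dy, n)
                if best is None or cost < best[0]:
                    best = (cost, dx, dy)
    return best  # (SAD, dx, dy) of the best match

ref = [[(3 * x + 5 * y) % 17 for x in range(8)] for y in range(8)]
cur = [[ref[y][(x - 1) % 8] for x in range(8)] for y in range(8)]  # shifted right
```

For this synthetic pair, the block at (3, 2) in `cur` matches the reference one column to the left, so the search returns the motion vector (-1, 0) with zero residual.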
|
293 |
Error-robust coding and transformation of compressed layered hybrid video streams for packet-switched wireless networks. Halbach, Till. January 2004 (has links)
<p>This dissertation considers packet-switched wireless networks for the transmission of variable-rate layered hybrid video streams. Target applications are video streaming and broadcasting services. The work can be divided into two main parts.</p><p>In the first part, a novel quality-scalable scheme based on coefficient refinement and encoder quality constraints is developed as a possible extension to the video coding standard H.264. After a technical introduction to the coding tools of H.264, with the main focus on error resilience features, various quality scalability schemes from previous research are reviewed. Based on this discussion, an encoder-decoder framework is designed for an arbitrary number of quality layers, thereby also enabling region-of-interest coding. After that, the performance of the new system is exhaustively tested, showing that the bit rate increase typically encountered with scalable hybrid coding schemes is, for certain coding parameters, only small to moderate. The double- and triple-layer constellations of the framework are shown to outperform other systems.</p><p>The second part considers layered code streams as generated by the scheme of the first part. Various error propagation issues in hybrid streams are discussed, which leads to the definition of a decoder quality constraint and a segmentation of the code stream to be transmitted. A packetization scheme based on successive source rate consumption is drafted, followed by the formulation of the channel code rate optimization problem for an optimal assignment of available codes to the channel packets. Proper MSE-based error metrics are derived, incorporating the properties of the source signal, a terminate-on-error decoding strategy, error concealment, inter-packet dependencies, and the channel conditions. 
The Viterbi algorithm is presented as a low-complexity solution to the optimization problem, showing a strong adaptivity of the joint source-channel coding scheme to the channel conditions. An almost constant image quality is achieved, also in mismatch situations, while the overall channel code rate decreases only as much as necessary as the channel quality deteriorates. It is further shown that the variance of the code distributions is small, and that the codes are assigned irregularly across the channel packets.</p><p>A double-layer constellation of the framework clearly outperforms other schemes by a substantial margin.</p><p>Keywords — Digital lossy video compression, visual communication, variable bit rate (VBR), SNR scalability, layered image processing, quality layer, hybrid code stream, predictive coding, progressive bit stream, joint source channel coding, fidelity constraint, channel error robustness, resilience, concealment, packet-switched, mobile and wireless ATM, noisy transmission, packet loss, binary symmetric channel, streaming, broadcasting, satellite and radio links, H.264, MPEG-4 AVC, Viterbi, trellis, unequal error protection</p>
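To give a rough flavor of unequal error protection itself, the following sketch assigns stronger channel codes to packets whose loss would cost more distortion, subject to a parity budget. The code rates, residual loss probabilities, and packet costs are all invented, and this greedy stand-in is not the dissertation's trellis/Viterbi optimization.

```python
CODES = {          # channel code rate -> residual packet-loss probability
    1.0: 0.20,     # no protection
    0.75: 0.05,
    0.5: 0.01,
}
STEPS = sorted(CODES)  # code rates from strongest to weakest protection

def assign_codes(costs, budget):
    # Greedily spend the parity budget where it buys the largest reduction
    # in expected distortion per unit of extra channel rate.
    rates = [1.0] * len(costs)
    while True:
        best = None
        for i, cost in enumerate(costs):
            k = STEPS.index(rates[i])
            if k == 0:
                continue                        # already strongest code
            nxt = STEPS[k - 1]
            extra = 1.0 / nxt - 1.0 / rates[i]  # extra channel bits per source bit
            if extra > budget:
                continue
            gain = cost * (CODES[rates[i]] - CODES[nxt]) / extra
            if best is None or gain > best[0]:
                best = (gain, i, nxt, extra)
        if best is None:
            return rates
        _, i, nxt, extra = best
        rates[i] = nxt
        budget -= extra

protection = assign_codes([10.0, 1.0, 1.0], budget=0.4)
```

With this budget only one upgrade fits, and it goes to the high-cost packet, illustrating why the resulting code assignment across packets is irregular.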
|
295 |
Skalierbares und flexibles Live-Video Streaming mit der Media Internet Streaming Toolbox (Scalable and Flexible Live Video Streaming with the Media Internet Streaming Toolbox). Pranke, Nico. 17 November 2009 (has links)
This work deals with the development and application of various concepts and algorithms for scalable live streaming of video, and with their implementation in the Media Internet Streaming Toolbox. The toolbox provides an extensible, platform-independent infrastructure for building every part of a live streaming system, from video capture through media processing and coding to delivery. The focus is on the flexible description of media processing and stream construction, and on the generation of client-specific stream formats with different qualities of service for as many clients as possible and their distribution over the Internet. An integrated graph-based concept is designed that combines Component Encoding Stream Construction, the use of compresslets, and automated flow-graph construction. The parts of the flow graph relevant to stream construction are executed separately from the rest for groups of clients with identical state. This leads to a maximum computational load that is independent of the number of clients, which is both shown theoretically and confirmed by concrete measurements. As examples of the use of the toolbox, two wavelet-based stream formats, among others, are developed, integrated, and compared with respect to coding efficiency and scalability.
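The per-group execution described above can be caricatured in a few lines: clients are bucketed by requested stream format, and the expensive stream-construction step runs once per bucket, so the encoding load scales with the number of distinct formats rather than the number of clients. All names below are invented.

```python
from collections import defaultdict

def encode(frame, fmt):
    # Stand-in for the expensive stream-construction step; the call counter
    # only exists to demonstrate the once-per-group property.
    encode.calls += 1
    return f"{fmt}:{frame}"
encode.calls = 0

def serve_frame(frame, clients):
    # clients: list of (client_id, format_key); format_key stands in for the
    # resolution/bit-rate/codec state that determines the produced stream.
    groups = defaultdict(list)
    for cid, fmt in clients:
        groups[fmt].append(cid)
    encoded = {fmt: encode(frame, fmt) for fmt in groups}  # once per group
    return {cid: encoded[fmt] for fmt, cids in groups.items() for cid in cids}

out = serve_frame("frame0", [(1, "low"), (2, "high"), (3, "low"), (4, "low")])
```

Four clients, two distinct formats: the construction step runs twice, independent of how many clients share each format.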
|
296 |
Video anatomy: spatial-temporal video profile. Cai, Hongyuan. 31 July 2014 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / A massive number of videos are uploaded to video websites, and smooth video browsing, editing, retrieval, and summarization are in demand. Most videos employ several types of camera operations to expand the field of view, emphasize events, and express cinematic effects. To digest the heterogeneous videos found on video websites and in databases, video clips are profiled into a 2D image scroll that contains both spatial and temporal information for video preview. The video profile is visually continuous, compact, and scalable, and it indexes each frame. This work analyzes camera kinematics, including zoom, translation, and rotation, and categorizes camera actions as their combinations. An automatic video summarization framework is proposed and developed. After conventional video clip segmentation, and further segmentation into sections of smooth camera operation, the global flow field under all camera actions is investigated for profiling various types of video. A new algorithm has been designed to extract the major flow direction and a convergence factor using condensed images. This work then proposes a uniform scheme to segment video clips and sections, sample the video volume across the major flow, and compute the flow convergence factor, in order to obtain an intrinsic scene space that is less influenced by camera ego-motion. A motion blur technique has also been used to render dynamic targets in the profile. The resulting video profile can be displayed in a video track to guide access to video frames, help video editing, and facilitate applications such as surveillance, visual archiving of environments, video retrieval, and online video preview.
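A toy sketch, loosely in the spirit of the condensed-image idea above: estimate a dominant horizontal displacement between two frames by comparing their 1D column projections. The projection and search details here are invented, and the thesis method additionally handles zoom, rotation, and the convergence factor.

```python
def column_profile(frame):
    # Collapse a 2D list of luma samples into one value per column.
    return [sum(row[x] for row in frame) for x in range(len(frame[0]))]

def dominant_shift(prev, cur, max_shift=3):
    # Return the displacement d minimizing the mean absolute difference
    # between cur and prev shifted by d columns, over the overlapping part.
    p, c = column_profile(prev), column_profile(cur)
    def cost(d):
        xs = range(max(0, d), min(len(p), len(p) + d))
        return sum(abs(c[x] - p[x - d]) for x in xs) / len(xs)
    return min(range(-max_shift, max_shift + 1), key=cost)

prev = [[(3 * x + 5 * y) % 17 for x in range(8)] for y in range(6)]
cur = [[row[(x - 1) % 8] for x in range(8)] for row in prev]  # moved right by one
```

Because the whole frame moved coherently, the 1D profiles carry enough information to recover the global motion without a dense flow field.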
|
297 |
A new adaptive trilateral filter for in-loop filtering. Kesireddy, Akitha. January 2014 (has links)
Indiana University-Purdue University Indianapolis (IUPUI) / HEVC has achieved significant coding efficiency improvements beyond existing video coding standards by employing many new coding tools. The Deblocking Filter, Sample Adaptive Offset, and Adaptive Loop Filter have been introduced for in-loop filtering during HEVC standardization. However, these filters operate in the spatial domain despite the temporal correlation within video sequences. To reduce artifacts and better align object boundaries in video, a new in-loop filtering algorithm is proposed and implemented in the HM-11.0 reference software. The proposed algorithm yields an average bitrate reduction of about 0.7% and improves the PSNR of the decoded frame by 0.05%, 0.30%, and 0.35% in the luma and the two chroma components, respectively.
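The proposed trilateral filter builds on bilateral filtering, which weights each neighbor by both spatial distance and intensity difference so that smoothing does not cross strong edges. A minimal 1D sketch with illustrative parameters follows; the actual in-loop filter operates on reconstructed blocks inside the codec and adds a further weighting term.

```python
import math

def bilateral_1d(signal, sigma_s=1.5, sigma_r=10.0, radius=3):
    # Each output sample is a weighted average of its neighborhood, with
    # weights falling off in both position (sigma_s) and value (sigma_r).
    out = []
    for i, v in enumerate(signal):
        wsum = vsum = 0.0
        for j in range(max(0, i - radius), min(len(signal), i + radius + 1)):
            w = (math.exp(-((i - j) ** 2) / (2 * sigma_s ** 2))
                 * math.exp(-((signal[j] - v) ** 2) / (2 * sigma_r ** 2)))
            wsum += w
            vsum += w * signal[j]
        out.append(vsum / wsum)
    return out

edge = bilateral_1d([0.0] * 5 + [100.0] * 5)          # a sharp edge survives
bump = bilateral_1d([10.0, 10.0, 12.0, 10.0, 10.0])   # small ripple is smoothed
```

This edge-preserving behavior is what makes such filters attractive for combating coding artifacts without blurring object boundaries.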
|
298 |
Multimedia Forensics Using Metadata. Ziyue Xiang (17989381). 21 February 2024 (has links)
<p dir="ltr">The rapid development of machine learning techniques makes it possible to manipulate or synthesize video and audio information while introducing nearly undetectable artifacts. Most media forensics methods analyze the high-level data (e.g., pixels from video, temporal signals from audio) decoded from compressed media data. Since media manipulation or synthesis methods usually aim to improve the quality of such high-level data directly, acquiring forensic evidence from these data has become increasingly challenging. In this work, we focus on media forensics techniques that use the metadata in media formats, which includes container metadata and coding parameters in the encoded bitstream. Since many media manipulation and synthesis methods do not attempt to hide metadata traces, it is possible to use them for forensics tasks. First, we present a video forensics technique using metadata embedded in MP4/MOV video containers. Our proposed method achieved high performance in video manipulation detection, source device attribution, social media attribution, and manipulation tool identification on publicly available datasets. Second, we present a transformer-neural-network-based MP3 audio forensics technique using low-level codec information. Our proposed method can localize multiple compressed segments in MP3 files, with higher localization accuracy than other methods. Third, we present an H.264-based video device matching method, which can determine whether two video sequences were captured by the same device even if the method has never encountered that device before. Our proposed method achieved good performance in a three-fold cross-validation scheme on a publicly available video forensics dataset containing 35 devices. Fourth, we present a Graph Neural Network (GNN) based approach for the analysis of MP4/MOV metadata trees. The proposed method is trained using Self-Supervised Learning (SSL), which increases its robustness and makes it capable of handling missing or unseen data. Fifth, we present an efficient approach to computing spectrogram features from MP3-compressed audio signals. The proposed approach decreases the complexity of speech feature computation by ~77.6% and saves ~37.87% of the MP3 decoding time, and the resulting spectrogram features lead to higher synthetic speech detection performance.</p>
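For a concrete taste of the container metadata used as evidence, here is a minimal sketch of walking the top-level box ("atom") layout of an MP4/MOV file. Real parsers also handle 64-bit box sizes and nested boxes, and the synthetic byte blob below is invented for illustration.

```python
import struct

def top_level_boxes(data):
    # Each basic MP4/MOV box starts with a 4-byte big-endian size followed
    # by a 4-character type code; the size covers the 8-byte header too.
    boxes, pos = [], 0
    while pos + 8 <= len(data):
        size, = struct.unpack(">I", data[pos:pos + 4])
        btype = data[pos + 4:pos + 8].decode("ascii", "replace")
        if size < 8:
            break  # malformed or 64-bit size; a real parser handles this
        boxes.append((btype, size))
        pos += size
    return boxes

# A tiny synthetic container: an ftyp box followed by a free box.
blob = (struct.pack(">I", 16) + b"ftyp" + b"isom" + struct.pack(">I", 512)
        + struct.pack(">I", 8) + b"free")
# top_level_boxes(blob) -> [("ftyp", 16), ("free", 8)]
```

The ordering, sizes, and presence of such boxes form exactly the kind of metadata fingerprint that manipulation tools tend to alter without hiding.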
|