171

A hybrid scheme for low-bit rate stereo image compression

Jiang, Jianmin, Edirisinghe, E.A. 29 May 2009 (has links)
No / We propose a hybrid scheme to implement an object-driven, block-based algorithm to achieve low bit-rate compression of stereo image pairs. The algorithm effectively combines the simplicity and adaptability of existing block-based stereo image compression techniques with an edge/contour-based object extraction technique to determine the appropriate compression strategy for various areas of the right image. Unlike existing object-based coding such as MPEG-4 developed in the video compression community, the proposed scheme does not require any additional shape coding. Instead, each arbitrary shape is reconstructed from the matching object inside the left frame, which has been encoded by the standard JPEG algorithm and is hence available at the decoding end for the corresponding shapes in right frames. Yet the shape reconstruction for right objects incurs no distortion, owing to the unique correlation between left and right frames inside stereo image pairs and the nature of the proposed hybrid scheme. Extensive experiments show that the proposed algorithm achieves significant improvements of up to 20% in compression ratio over the existing block-based technique, while reconstructed image quality is maintained at a competitive level in terms of both PSNR values and visual inspection.
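The block-based baseline that the hybrid scheme builds on predicts each block of the right image from a disparity-shifted block of the (independently JPEG-coded) left image. Below is a minimal sketch of that prediction step, assuming grayscale images, fixed 8x8 blocks, and a horizontal-only SAD search; the block size, search range, and toy input are illustrative, not the thesis's settings.

```python
import numpy as np

def disparity_predict(left, right, block=8, max_disp=32):
    """Predict each block of the right image from the left image by a
    horizontal minimum-SAD search (block-based disparity compensation).
    Returns the predicted right image and the per-block disparity map."""
    h, w = right.shape
    pred = np.zeros_like(right)
    disp = np.zeros((h // block, w // block), dtype=int)
    for by in range(0, h - block + 1, block):
        for bx in range(0, w - block + 1, block):
            target = right[by:by+block, bx:bx+block].astype(int)
            best_sad, best_d = None, 0
            for d in range(0, min(max_disp, w - block - bx) + 1):
                cand = left[by:by+block, bx+d:bx+d+block].astype(int)
                sad = np.abs(target - cand).sum()
                if best_sad is None or sad < best_sad:
                    best_sad, best_d = sad, d
            disp[by // block, bx // block] = best_d
            pred[by:by+block, bx:bx+block] = left[by:by+block, bx+best_d:bx+best_d+block]
    return pred, disp

left = np.random.randint(0, 256, (64, 64)).astype(np.uint8)
right = np.roll(left, -4, axis=1)  # toy right view: uniform 4-pixel disparity
pred, disp = disparity_predict(left, right)
print("mean |residual|:", np.abs(pred.astype(int) - right.astype(int)).mean())
```

The object-driven part of the scheme would then switch strategies per region (e.g., reusing the matched left-frame object outline instead of coding a shape) rather than applying this uniform block search everywhere.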
172

Design and Implementation of An Emulation Testbed for Video Communications in Ad Hoc Networks

Wang, Xiaojun 09 February 2006 (has links)
Video communication is an important application in wireless ad hoc network environments. Although current off-the-shelf video communication software would work for an ad hoc network operating under stable conditions (e.g., extremely low link and node failure rates), video communication for ad hoc networks operating under extreme conditions remains a challenging problem. This is because traditional video codecs, whether single-stream or layered, require at least one relatively stable path between source and destination nodes. Recent advances in multiple description (MD) video coding have opened up new possibilities to offer video communications over ad hoc networks. In this thesis, we perform a systematic study of MD video for ad hoc networks. The theoretical foundation of this research is an application-centric approach that formulates a cross-layer multipath routing problem minimizing the application-layer video distortion. The solution procedure for this complex optimization problem is based on the so-called Genetic Algorithm (GA). The theoretical results have been documented in [7] and are reviewed in Chapter 2. Although the theoretical foundation for MD video over dynamic ad hoc networks has been laid, there remains a lot of skepticism in the research community about whether such cross-layer optimal routing can be implemented in practice. To fill this gap, this thesis is devoted to the experimental research (or proof of concept) for the work in [7]. Our approach is to design and implement an emulation testbed where we can actually implement the ideas and algorithms proposed in [7] in a controlled laboratory setting; a sketch of the optimization core follows this list. The highlights of our experimental research include:
1. A testbed that emulates three properties of a wireless ad hoc network: topology, link success probability, and link bandwidth;
2. A source routing implementation that can easily support comparative studies between the proposed GA-based routing and other routing schemes under different network conditions;
3. A modified H.263+ video codec that employs an Unequal Error Protection (UEP) approach to generate MD video;
4. Implementation of three experiments that
• compared the GA-based routing with existing technologies (NetMeeting video conferencing plus AODV routing);
• compared our GA-based routing with network-centric routing schemes (two-disjoint-paths routing);
• proved that our approach has great potential in supporting video communications in wireless ad hoc networks;
5. Experimental results that show the proposed cross-layer optimization significantly outperforms current off-the-shelf technologies, and that it provides much better performance than network-centric routing schemes in supporting routing of MD video.
In summary, the experimental research in this thesis has demonstrated that a cross-layer multipath routing algorithm can be practically implemented in a dynamic ad hoc network to support video communications. / Master of Science
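The optimization core referenced above couples a genetic search with an application-layer objective. The sketch below shows the shape of that idea for two-description MD video over two paths; the path list, success probabilities, and the `distortion` fitness are invented stand-ins, not the actual model of [7].

```python
import random

# Hypothetical candidate paths with per-path packet-success probabilities.
paths = [f"p{i}" for i in range(10)]
success = {p: random.uniform(0.4, 0.95) for p in paths}

def distortion(pair):
    """Toy application-layer objective for MD video over two paths:
    both descriptions lost -> distortion 1.0, one arrives -> 0.4,
    both arrive -> 0.1 (weights are illustrative)."""
    pa, pb = success[pair[0]], success[pair[1]]
    return (1-pa)*(1-pb)*1.0 + (pa*(1-pb) + pb*(1-pa))*0.4 + pa*pb*0.1

def ga(pop_size=20, generations=50, mutate_p=0.2):
    """Minimal GA: rank selection, one-point crossover, random mutation."""
    pop = [tuple(random.sample(paths, 2)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=distortion)
        survivors = pop[:pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = random.sample(survivors, 2)
            child = (a[0], b[1]) if a[0] != b[1] else (a[0], b[0])
            if random.random() < mutate_p:
                child = (child[0], random.choice(paths))
            if child[0] != child[1]:       # keep only valid distinct path pairs
                children.append(child)
        pop = survivors + children
    return min(pop, key=distortion)

best = ga()
print("best path pair:", best, "expected distortion:", round(distortion(best), 3))
```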
173

Fast-forward functions on parallel video servers

Ding, Zhiyong 01 January 1999 (has links)
No description available.
174

Objective video quality analysis of MPEG-1, MPEG-2, and Windows Media video formats

Aeluri, Praveen 01 July 2003 (has links)
No description available.
175

Perceptual Criterion Based Rate Control And Fast Mode Search For Spatial Intra Prediction In Video Coding

Nagori, Soyeb 05 1900 (has links)
This thesis dwells on two important problems in the field of video coding, namely rate control and spatial domain intra prediction. While the former applies generally to most video compression standards, the latter applies to recent advanced video compression standards such as H.264, VC1, and AVS. Rate control regulates the instantaneous video bit-rate to maximize a picture quality metric while satisfying channel rate and buffer size constraints, and it has an important bearing on the picture quality of encoded video. Typically, a quality metric such as peak signal-to-noise ratio (PSNR) or weighted signal-to-noise ratio (WSNR) is chosen out of convenience; however, neither metric is a true measure of perceived video quality. A few researchers have attempted to derive rate control algorithms combining standard PSNR with ad-hoc perceptual metrics of video quality. The concept of using a perceptual criterion for video coding was introduced in [7] within the context of perceptual adaptive quantization. In that work, quantization noise levels were adjusted such that more noise was allowed where it was less visible (busy and textured areas) while sensitive areas (typically flat and low-detail regions) were finely quantized. Macro-blocks were classified into low-detail, texture, and edge areas by a classifier that studied the variance of sub-blocks within a macro-block (MB), and the rate models were trained on sets of pre-classified video. One drawback of the above scheme, as with standard PSNR, is that neither accounts for the perceptual effect of motion. The work in [8] achieved this by assigning higher weights to the regions of the image experiencing the most motion; also, the center of the image and objects in the foreground are perceived as more important than the sides. However, attempts to use perceptual metrics for video quality have been limited by the accuracy of the chosen metrics. In recent years, new and improved metrics of subjective quality have been invented and their statistical accuracy studied in a formal manner. Particularly interesting is the work undertaken by the ITU and the Video Quality Experts Group (VQEG). VQEG conducted two phases of testing: in the first phase, several algorithms were tested, but none were found to be any more accurate than a PSNR-based metric. In the second phase of testing a few years later, a few new algorithms were experimented with, and it was concluded that four of these achieved results good enough to warrant their standardization as part of ITU-T Recommendation J.144. These experiments are referred to as the FR-TV (Full Reference Television) phase-II evaluations. ITU-T J.144 does not explicitly identify a single algorithm but provides guidelines on the selection of appropriate techniques to objectively measure subjective video quality. It describes four reference algorithms as well as PSNR. Among the four, the NTIA General Video Quality Model (VQM) [11] is the best performing and has been adopted by the American National Standards Institute (ANSI) as North American standard T1.801.03. NTIA's approach has been to focus on defining parameters that model how humans perceive video quality; these parameters have been combined using linear models to produce estimates of video quality that closely approximate subjective test results.
The NTIA General Video Quality Model (VQM) has been proven to have strong correlation with subjective quality. In the first part of the thesis, we apply metrics motivated by the NTIA-VQM model within a rate control algorithm to maximize perceptual video quality. We derive perceptual weights from key NTIA parameters to influence the QP value that decides the degree of quantization. Our experiments demonstrate that a perceptual-quality-motivated TMN-8 rate control in an H.263 encoder yields perceivable quality improvements over a baseline TMN-8 rate control algorithm that uses a PSNR metric. Experimental results on a set of 11 sequences show an average 6% reduction in bitrate using the proposed algorithm for the same perceptual quality as standard TMN-8. The second part of our thesis work deals with spatial domain intra prediction as used in advanced video coding standards such as H.264. The H.264 Advanced Video Coding standard [36] has been shown to achieve video quality similar to older standards such as MPEG-2 and H.263 at nearly half the bit-rate. Generally, this compression improvement is attributed to several new tools introduced in H.264, including spatial intra prediction, adaptive block sizes for motion compensation, an in-loop de-blocking filter, context-adaptive binary arithmetic coding (CABAC), and multiple reference frames. While the new tools allow better coding efficiency, they also introduce additional computational complexity at both the encoder and decoder ends. We are especially concerned here with the impact of intra prediction on the computational complexity of the encoder. H.264 reference implementations such as JM [29] search through all allowed intra-prediction "modes" in order to find the optimal mode. While this approach yields the optimal prediction mode, it comes at an extremely heavy computational cost. Hence there is much interest in well-motivated algorithms that reduce the computational complexity of the search for the best prediction mode while retaining the quality advantages of full-search Intra4x4. We propose a novel algorithm to reduce the complexity of full search by exploiting knowledge of the source statistics. Specifically, we analyze the transform-domain energy distribution of the original 4x4 block in different directions and use the results of this analysis to eliminate unlikely modes and reduce the search space for the optimal intra mode. Experimental results show that the proposed algorithm achieves quality metrics (PSNR) similar to full search at nearly a third of the complexity. This thesis has four chapters and is organized as follows. In the first chapter we introduce the basics of video encoding, present existing work in the area of perceptual rate control, briefly introduce the TMN-8 rate control algorithm, and finally introduce spatial domain intra prediction. In the second chapter we explain the challenges in combining NTIA perceptual parameters with the TMN-8 rate control algorithm: we examine the perceptual features used by NTIA from a video compression perspective, explain how the perceptual metrics capture typical compression artifacts, present a two-pass perceptual rate control (PRC-II) algorithm, and list experimental results on a set of video sequences showing an average 6% bit-rate reduction from PRC-II rate control over standard TMN-8. Chapter 3 contains part II of our thesis work, on spatial domain intra prediction: we review existing work in intra prediction and then present the details of our proposed intra prediction algorithm and experimental results. We conclude in Chapter 4 and discuss directions for future work on both proposed algorithms.
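The mode-pruning idea in the second part lends itself to a compact illustration: the 2-D DCT of a 4x4 block concentrates energy along the frequency axis orthogonal to the block's dominant structure, so comparing row-frequency and column-frequency energies can rule out unlikely prediction directions before any rate-distortion search. A minimal sketch, assuming grayscale blocks, a two-direction mode mapping, and an invented `margin` threshold (the thesis's actual decision rules and mode set differ):

```python
import numpy as np
from scipy.fftpack import dct

def likely_intra_modes(block4x4, margin=2.0):
    """Energy-based mode pruning sketch: energy in vertical frequencies
    (variation down the columns) indicates horizontal structure, which
    favors horizontal prediction, and vice versa."""
    c = dct(dct(block4x4.astype(float), axis=0, norm='ortho'),
            axis=1, norm='ortho')
    e_vert_freq = (c[1:, 0] ** 2).sum()  # variation along y: horizontal edges
    e_horz_freq = (c[0, 1:] ** 2).sum()  # variation along x: vertical edges
    if e_vert_freq > margin * e_horz_freq:
        return ["horizontal", "DC"]
    if e_horz_freq > margin * e_vert_freq:
        return ["vertical", "DC"]
    return ["vertical", "horizontal", "DC", "diagonal"]  # no clear winner

block = np.tile(np.array([10, 10, 200, 200]), (4, 1))  # block with a vertical edge
print(likely_intra_modes(block))  # -> ['vertical', 'DC']
```

Only the surviving modes would then be evaluated with the encoder's usual cost function, which is where a roughly threefold complexity reduction can come from.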
176

FPGA Prototyping of a Watermarking Algorithm for MPEG-4

Cai, Wei 05 1900 (has links)
In the immediate future, multimedia product distribution through the Internet will become mainstream. However, it can also have the side effect of unauthorized duplication and distribution of multimedia products, a critical challenge to the legal ownership of copyright and intellectual property. Many schemes have been proposed to address these issues; one is digital watermarking, which is appropriate for image and video copyright protection. Videos distributed via the Internet must be compressed to a low bit rate due to bandwidth limitations, and the most widely adopted video compression standard is MPEG-4. Discrete cosine transform (DCT) domain watermarking is a secure algorithm that can survive video compression procedures and, most importantly, attacks attempting to remove the watermark: such attacks leave the video quality visibly degraded. For a commercial broadcast video system, real-time response is always required; for this reason, an FPGA hardware implementation is studied in this work. This thesis deals with video compression, watermarking algorithms, and their hardware implementation on FPGAs. A prototype VLSI architecture implements the video compression and watermarking algorithms on the FPGA, and the prototype is evaluated with video and watermarking quality metrics. Finally, it is seen that watermarking in the compressed domain yields video quality only 1 dB of PSNR lower than in the uncompressed domain; however, the cost of compressed-domain watermarking is the complexity of drift compensation for canceling the drifting effect.
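To make the DCT-domain mechanism concrete, here is a minimal software sketch of block-wise watermark embedding and non-blind detection, assuming grayscale frames, one bit per 8x8 block, and invented mid-band coefficient positions and strength `alpha`; the thesis's actual algorithm, its drift compensation, and its FPGA mapping are more involved.

```python
import numpy as np
from scipy.fftpack import dct, idct

def dct2(b):  return dct(dct(b, axis=0, norm='ortho'), axis=1, norm='ortho')
def idct2(b): return idct(idct(b, axis=0, norm='ortho'), axis=1, norm='ortho')

MIDBAND = [(2, 3), (3, 2), (3, 3)]  # illustrative mid-frequency positions

def pn_pattern(seed=42):
    return np.random.default_rng(seed).choice([-1.0, 1.0], size=len(MIDBAND))

def embed(frame, bits, alpha=8.0, seed=42):
    """Add +/- alpha * PN pattern to mid-band DCT coefficients of each
    8x8 block (the sign carries the bit). Mid-band terms survive coarse
    quantization better than high bands and distort less than low bands."""
    pattern, out = pn_pattern(seed), frame.astype(float).copy()
    blocks_per_row = frame.shape[1] // 8
    for i, bit in enumerate(bits):
        y, x = (i // blocks_per_row) * 8, (i % blocks_per_row) * 8
        c = dct2(out[y:y+8, x:x+8])
        for (u, v), p in zip(MIDBAND, pattern):
            c[u, v] += alpha * p * (1 if bit else -1)
        out[y:y+8, x:x+8] = idct2(c)
    return np.clip(out, 0, 255)

def extract(orig, marked, nbits, seed=42):
    """Non-blind detector: correlate the DCT difference with the pattern."""
    pattern, bits = pn_pattern(seed), []
    blocks_per_row = orig.shape[1] // 8
    for i in range(nbits):
        y, x = (i // blocks_per_row) * 8, (i % blocks_per_row) * 8
        d = dct2(marked[y:y+8, x:x+8]) - dct2(orig[y:y+8, x:x+8].astype(float))
        bits.append(1 if sum(d[u, v] * p for (u, v), p in zip(MIDBAND, pattern)) > 0 else 0)
    return bits

frame = np.random.randint(0, 256, (64, 64)).astype(np.uint8)
payload = [1, 0, 1, 1, 0, 0, 1, 0]
marked = embed(frame, payload)
print(extract(frame, marked, len(payload)) == payload)  # True
```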
177

Multimedia data dissemination in opportunistic systems

Klaghstan, Merza 01 December 2016 (has links)
Opportunistic networks (OppNets) are human-centric mobile ad-hoc networks in which neither the topology nor the participating nodes are known in advance. Routing is planned dynamically following the store-carry-and-forward paradigm, which takes advantage of people's mobility; this widens the range of communication and supports indirect end-to-end data delivery. But due to individuals' mobility, OppNets are characterized by frequent communication disruptions and uncertain data delivery.
Hence, these networks are mostly used for exchanging small messages like disaster alarms or traffic notifications; scenarios that require the exchange of larger data remain challenging given the characteristics of this kind of network. There are nevertheless multimedia-sharing scenarios where a user might need to switch to an ad-hoc alternative, for example 1) absence of infrastructural networks in remote rural areas, 2) high costs due to limited data volumes, or 3) undesirable censorship by third parties while exchanging sensitive content. Consequently, we target in this thesis a video dissemination scheme in OppNets. For the video delivery problem in sparse opportunistic networks, we propose a solution that comprises three contributions. The first granulates videos at the source node into smaller parts and associates them with unequal redundancy degrees. This is technically based on Scalable Video Coding (SVC), which encodes a video into several layers of unequal importance for viewing the content at different quality levels. Layers are routed using the Spray-and-Wait routing protocol, with different redundancy factors for the different layers depending on their degree of importance. In this context, a video-viewing QoE metric is also proposed that takes perceived video quality, delivery delay, and network overhead into consideration, on a scalable basis. Second, we take advantage of the small units of the Network Abstraction Layer (NAL) that compose SVC layers: NAL units are packetized together under specific size constraints to optimize granularity, and packet sizes are tuned adaptively with regard to dynamic network conditions. Each node records a history of environmental information regarding contacts and forwarding opportunities, and uses this history to predict future opportunities and optimize the sizes accordingly. Lastly, the receiver node is pushed into action by reacting to missing data parts with a composite backward loss concealment mechanism: the receiver first asks for the missing data from other nodes in the network in request-response form; then, since the transmission concerns video content, video frame-loss error concealment techniques are also exploited at the receiver side. The two techniques are combined in a loss concealment mechanism that can thus react to missing data parts.
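A minimal sketch of the first contribution's unequal-redundancy idea, assuming binary Spray-and-Wait and an invented geometric copy budget per SVC layer (the thesis derives its redundancy factors from layer importance rather than from these constants):

```python
def spray_copies(num_layers, base_copies=16, decay=0.5, min_copies=2):
    """Per-layer Spray-and-Wait copy budgets: the SVC base layer (0) is
    sprayed most aggressively; each enhancement layer gets a smaller
    budget, since losing it only lowers playback quality."""
    return {layer: max(min_copies, int(base_copies * decay ** layer))
            for layer in range(num_layers)}

def forward(copies_left):
    """Binary Spray-and-Wait hand-off: give half the remaining copies to
    the encountered node; with one copy left, switch to the wait phase
    (deliver only on direct contact with the destination)."""
    if copies_left <= 1:
        return copies_left, 0
    handed = copies_left // 2
    return copies_left - handed, handed

budget = spray_copies(num_layers=4)
print(budget)                # {0: 16, 1: 8, 2: 4, 3: 2}
print(forward(budget[0]))    # (8, 8)
```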
178

ARMOR - Adjusting Repair and Media Scaling with Operations Research for Streaming Video

Wu, Huahui 04 May 2006 (has links)
Streaming multimedia quality is impacted by two main factors: capacity constraints and packet loss. To match the capacity constraint while preserving real-time playout, media scaling can be used to discard the encoded multimedia content that has the least impact on perceived video quality. To limit the impact of lost packets, repair techniques, e.g. forward error correction (FEC), can be used to repair frames damaged by packet loss. However, adding repair data requires further reduction of the original multimedia data, making the decision of how much repair data to use critically important. Assuming a limited network capacity and an estimate of the current packet loss rate along a flow path, selecting the best distribution of FEC packets for video frames with inherent interframe encoding dependencies can be cast as a constrained optimization problem that attempts to optimize the quality of the video stream. This thesis presents an Adjusting Repair and Media scaling with Operations Research (ARMOR) system. An analytical model is derived for streaming video with FEC and media scaling: given parameters representing network loss as well as video frame types and sizes, once the number of FEC packets per video frame type and the media scaling pattern are specified, the model estimates the video quality at the receiver side. The model is then used in an operations research algorithm to adjust the FEC strength and media scaling level to yield the best quality under the capacity constraint. Four combinations of FEC type and media scaling method are studied: Media Independent FEC with Temporal Scaling (MITS), Media Independent FEC with Quality Scaling (MIQS), Media Independent FEC with Temporal and Quality Scaling (MITQS), and Media Dependent FEC with Quality Scaling (MDQS). The analytical experiments show: 1) adjusting FEC always achieves higher video quality than streaming video without FEC or with a fixed amount of FEC; 2) Quality Scaling usually works better than Temporal Scaling; and 3) Media Dependent FEC (MDFEC) is typically less effective than Media Independent FEC (MIFEC). A user study with 74 participants is presented; its analysis shows that the ARMOR model can accurately estimate users' perceptual quality. Well-designed simulations and a realistic system implementation suggest the ARMOR system can practically improve the quality of streaming video.
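The delivery probability underlying such a model is simple to state: with media-independent FEC, a frame sent as k data packets plus n−k FEC packets is decodable iff at least k of the n packets arrive, a binomial tail. The toy sketch below exhaustively picks an FEC allocation per frame type under a packet budget; the frame sizes, importance weights, and loss rate are invented, and the thesis's distortion model and operations research search are far richer.

```python
from math import comb
from itertools import product

def deliver_prob(k, n, loss):
    """P(frame decodable) under (n, k) media-independent FEC:
    at least k of the n sent packets must arrive."""
    return sum(comb(n, i) * (1 - loss)**i * loss**(n - i)
               for i in range(k, n + 1))

def best_fec(frame_sizes, weights, budget, loss, max_fec=4):
    """Enumerate FEC packet counts per frame type (I, P, B here) and keep
    the allocation maximizing expected weighted delivery under the budget."""
    best, best_q = None, -1.0
    for fec in product(range(max_fec + 1), repeat=len(frame_sizes)):
        if sum(s + f for s, f in zip(frame_sizes, fec)) > budget:
            continue
        q = sum(w * deliver_prob(s, s + f, loss)
                for s, f, w in zip(frame_sizes, fec, weights))
        if q > best_q:
            best, best_q = fec, q
    return best, best_q

# I frames weighted highest because P and B frames depend on them.
fec, q = best_fec(frame_sizes=[4, 2, 1], weights=[5, 3, 1],
                  budget=11, loss=0.1)
print("FEC packets per frame type (I, P, B):", fec,
      "| expected weighted quality:", round(q, 3))
```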
179

Energy characterization of high efficiency video coding (HEVC) in a general purpose processor

Monteiro, Eduarda Rodrigues January 2017 (has links)
The popularization of high-resolution digital video applications brings several challenges in developing new and efficient techniques to maintain video compression efficiency. To respond to this demand, the HEVC standard was proposed, aiming to double the compression ratio compared to its predecessors. However, to achieve such a goal, HEVC imposes a high computational cost and, consequently, increased energy consumption. This scenario becomes even more concerning for battery-powered mobile devices, which face computational constraints when processing multimedia applications. Most related work on encoder realization concentrates on reducing and managing computational effort; there is thus a lack of information regarding the energy consumption of video encoders, especially the energy impact of the cache hierarchy in this context. This thesis presents a methodology for energy characterization of the HEVC video encoder on general-purpose processors.
The main goal of this methodology is to provide quantitative data regarding HEVC energy consumption. The methodology is composed of two modules: one focuses on HEVC processing, the other on HEVC behavior with respect to cache-memory-related consumption. One of the main advantages of the second module is that it remains independent of application and processor architecture. Several analyses are performed aiming at the energy characterization of HEVC coding, considering different video sequences, resolutions, and encoder parameters. In addition, an extensive and detailed analysis of different cache configurations is performed in order to evaluate the energy impact of such configurations during video coding. The results obtained with the proposed characterization demonstrate that managing the video coding parameters in conjunction with the cache specifications has high potential for reducing the energy consumption of video coding while maintaining good coding efficiency.
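As a rough illustration of the kind of model such a characterization feeds, total encoding energy can be split into core energy over the run plus per-level cache energy derived from access and hit/miss counts. Every constant below is a placeholder; a real characterization measures them per platform (e.g., via hardware performance counters and power interfaces).

```python
def cache_energy_nj(accesses, hit_rate, e_hit_nj, e_miss_nj):
    """Energy (nJ) of one cache level: hits pay the level's access
    energy; misses pay e_miss_nj, which folds in the next level's cost."""
    hits = accesses * hit_rate
    return hits * e_hit_nj + (accesses - hits) * e_miss_nj

def encoder_energy_j(cycles, power_w, freq_hz, levels):
    """Total = core energy over the encoding run + memory-hierarchy energy."""
    core_j = power_w * cycles / freq_hz
    mem_j = sum(cache_energy_nj(*lvl) for lvl in levels) * 1e-9
    return core_j + mem_j

# Invented run: 2e11 cycles at 2 GHz and 30 W; L1/L2 statistics invented.
levels = [
    (5e10, 0.95, 0.5, 5.0),    # L1: accesses, hit rate, nJ/hit, nJ/miss
    (2.5e9, 0.80, 5.0, 60.0),  # L2
]
print(round(encoder_energy_j(2e11, 30, 2e9, levels), 1), "J")
```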
180

Coherent motion segmentation applied to object-based video coding

Silva, Luciano Silva da January 2011 (has links)
The variety of electronic devices for digital video recording and playback is growing rapidly, increasing the availability of such information on many different platforms, so the development of efficient ways of storing, transmitting, and accessing such data becomes increasingly important. In this context, video coding plays a key role by compressing data, optimizing resource usage for storing and transmitting digital video. Nevertheless, tasks involving video analysis, manipulation, and content-based search also become increasingly relevant, forming a basis for several applications that exploit the abundance of information in digital video.
Often the solution to these problems makes use of video segmentation, which consists of dividing a video into regions that are homogeneous according to certain characteristics, such as color, texture, motion, or some semantic aspect. In this thesis, a new method for segmenting videos into their constituent objects based on the motion coherence of regions is proposed. The proposed segmentation method initially identifies correspondences between sparsely sampled points along different video frames. It then clusters point sets that have similar trajectories. Finally, a pixelwise classification is obtained from these sampled point sets. The proposed method does not assume any camera model or global motion model for the scene and/or objects, and it allows the identification of multiple objects without knowing the number of objects a priori. In order to validate the proposed segmentation method, an object-based video coding approach was developed. In this approach, the motion of an object is represented by affine transformations, while object texture and shape are coded simultaneously and progressively. The developed video coding method provides functionalities such as progressive transmission and object-level scalability. Experimental results obtained by the proposed segmentation and coding methods are presented and compared to other methods from the literature. Videos coded by the proposed method are compared in terms of PSNR to videos coded by the reference software JM H.264/AVC, version 16.0, showing how far the proposed method is from the state of the art in terms of coding efficiency while providing the functionalities of object-based video coding. The segmentation method proposed in this work resulted in two publications, one in the proceedings of SIBGRAPI 2007 and another in the journal IEEE Transactions on Image Processing.
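The affine motion representation used by the coding approach reduces to a small least-squares problem per object: given tracked correspondences between two frames, solve for the 2x2 matrix A and translation t of [x', y']^T = A [x, y]^T + t. A minimal sketch with synthetic correspondences (trajectory clustering would then group points whose residual under a shared fit stays low); the motion and noise level below are invented.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares affine fit mapping src -> dst point sets (N x 2).
    Needs at least three non-collinear points."""
    n = src.shape[0]
    M = np.zeros((2 * n, 6))
    M[0::2, 0:2] = src; M[0::2, 4] = 1.0  # x' = a11*x + a12*y + tx
    M[1::2, 2:4] = src; M[1::2, 5] = 1.0  # y' = a21*x + a22*y + ty
    params, *_ = np.linalg.lstsq(M, dst.reshape(-1), rcond=None)
    return params[:4].reshape(2, 2), params[4:]

def residual(src, dst, A, t):
    """Mean reprojection error; a low value over a point cluster means
    the cluster moves coherently under a single affine model."""
    return np.linalg.norm(src @ A.T + t - dst, axis=1).mean()

rng = np.random.default_rng(0)
src = rng.uniform(0, 100, (20, 2))
A_true = np.array([[1.05, 0.02], [-0.02, 1.05]])  # slight zoom + rotation
dst = src @ A_true.T + np.array([3.0, -1.5]) + rng.normal(0, 0.1, (20, 2))
A, t = fit_affine(src, dst)
print("mean residual:", round(residual(src, dst, A, t), 3))  # ~ noise level
```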
