151 |
Techniques d'amélioration des performances de compression dans le cadre du codage vidéo distribué / Techniques for improving the performance of distributed video coding
Abou El Ailah, Abdalbassir, 14 December 2012
Distributed Video Coding (DVC) is a recently proposed paradigm in video communication that is well suited to emerging applications such as wireless video surveillance, multimedia sensor networks, wireless PC cameras, and mobile camera phones.
These applications require low-complexity encoding, while possibly affording high-complexity decoding. In DVC, Side Information (SI) is estimated at the decoder from the available decoded frames and used for the decoding and reconstruction of the other frames. In this PhD thesis, we propose new techniques to improve the quality of the SI. First, successive refinement of the SI is performed after each decoded DCT band. Then, a new scheme for SI generation is proposed, based on backward and forward motion estimation with Quad-tree refinement. Furthermore, new methods for combining global and local motion estimations are proposed to further improve the SI, using the differences between the corresponding blocks and a Support Vector Machine (SVM). In addition, algorithms are proposed to refine the fusion during the decoding process. The foreground objects segmented from the reference frames are also used in the combination of the global and local motion estimations, using elastic curves and object-based motion compensation. Extensive experiments show that the proposed techniques achieve significant gains over the classical DISCOVER codec. In addition, DVC with the proposed algorithms now outperforms H.264/AVC Intra and H.264/AVC No motion for the tested sequences. The gap with H.264/AVC in an Inter IB…IB configuration is also significantly reduced.
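As a rough illustration of how a decoder can estimate side information from two decoded frames, the sketch below performs a generic bilateral block-matching interpolation. It is not the scheme proposed in the thesis (no Quad-tree refinement, global motion, or SVM fusion), and the function names, block size, and search range are assumptions chosen for the example. Frames are plain lists of rows whose dimensions are assumed to be multiples of the block size.

```python
def block(frame, top, left, size):
    """Return the size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def interpolate_side_information(prev, nxt, size=2, search=1):
    """Estimate the frame halfway between two decoded frames (hypothetical helper).

    For every block of the missing frame, try symmetric motion vectors
    (dy, dx): the block is assumed to sit at (top - dy, left - dx) in the
    previous frame and at (top + dy, left + dx) in the next frame. The
    vector with the smallest SAD wins and the two matched blocks are averaged.
    """
    h, w = len(prev), len(prev[0])
    # Candidate vectors, smallest motion first, so ties favour zero motion.
    candidates = sorted(
        ((dy, dx) for dy in range(-search, search + 1)
         for dx in range(-search, search + 1)),
        key=lambda v: abs(v[0]) + abs(v[1]),
    )
    si = [[0] * w for _ in range(h)]
    for top in range(0, h - size + 1, size):
        for left in range(0, w - size + 1, size):
            best = None
            for dy, dx in candidates:
                pt, pl = top - dy, left - dx
                nt, nl = top + dy, left + dx
                inside = (0 <= pt <= h - size and 0 <= pl <= w - size
                          and 0 <= nt <= h - size and 0 <= nl <= w - size)
                if not inside:
                    continue  # skip vectors that point outside either frame
                a, b = block(prev, pt, pl, size), block(nxt, nt, nl, size)
                cost = sad(a, b)
                if best is None or cost < best[0]:
                    best = (cost, a, b)
            _, a, b = best  # the zero vector is always valid, so best is set
            for r in range(size):
                for c in range(size):
                    si[top + r][left + c] = (a[r][c] + b[r][c]) // 2
    return si
```

For two identical frames the zero vector wins everywhere and the estimate equals the input; for content that shifts by two pixels between the frames, interior blocks are reconstructed at the one-pixel midpoint.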
|
152 |
Light-field image and video compression for future immersive applications / Compression d'image et vidéo light-field pour les futures applications immersives
Dricot, Antoine, 01 March 2017
Advances in video technologies tend to offer increasingly immersive experiences. However, currently available 3D technologies are still very limited and only provide uncomfortable and unnatural viewing conditions for users.
The next generation of immersive video technologies therefore appears as a major technical challenge, particularly with the promising light-field (LF) approach. The light-field represents all the light rays (i.e. in all directions) in a scene. New devices for sampling and capturing the light-field of a scene are emerging fast, such as camera arrays or plenoptic cameras based on lenticular arrays. Several kinds of display systems target immersive applications, such as head-mounted displays and projection-based light-field display systems, and promising target applications already exist (e.g. 360° video and virtual reality). For several years now, the light-field representation has been drawing a lot of interest from many companies and institutions, for example in the MPEG and JPEG groups. Light-field contents have specific structures and use a massive amount of data, which represents a challenge for setting up future services. One of the main goals of this work is first to assess which technologies and formats are realistic or promising. The study is carried out from the perspective of image and video compression, as compression efficiency is a key factor in bringing these services to consumer markets. Secondly, improvements and new coding schemes are proposed to increase compression performance and enable efficient light-field content transmission on future networks.
|
153 |
Efficient Support for Application-Specific Video Adaptation
Huang, Jie, 01 January 2006
As video applications become more diverse, video must be adapted in different ways to meet the requirements of different applications when there are insufficient resources. In this dissertation, we address two sorts of requirements that cannot be addressed by existing video adaptation technologies: (i) accommodating large variations in resolution and (ii) collecting video effectively in a multi-hop sensor network. In addition, we also address requirements for implementing video adaptation in a sensor network.
Accommodating large variation in resolution is required by the existence of display devices with widely disparate screen sizes. Existing resolution adaptation technologies usually aim at adapting video between two resolutions. We examine the limitations of these technologies that prevent them from supporting a large number of resolutions efficiently. We propose several hybrid schemes and study their performance. Among these hybrid schemes, Bonneville, a framework that combines multiple encodings with limited scalability, can make good trade-offs when organizing compressed video to support a wide range of resolutions.
Video collection in a sensor network requires adapting video in a multi-hop store-and-forward network with multiple video sources. This task cannot be supported effectively by existing adaptation technologies, which are designed for real-time streaming applications from a single source over IP-style end-to-end connections. We propose to adapt video in the network instead of at the network edge. We also propose a framework, Steens, to compose adaptation mechanisms on multiple nodes. We design two signaling protocols in Steens to coordinate multiple nodes. Our simulations show that in-network adaptation can use buffer space on intermediate nodes for adaptation and achieve better video quality than conventional network-edge adaptation. Our simulations also show that explicit collaboration among multiple nodes through signaling can improve video quality, waste less bandwidth, and maintain bandwidth-sharing fairness.
The implementation of video adaptation in a sensor network requires system support for programmability, retaskability, and high performance. We propose Cascades, a component-based framework, to provide the required support. A prototype implementation of Steens in this framework shows that the performance overhead is less than 5% compared to a hard-coded C implementation.
|
154 |
Reliability of Pre-Service Teachers Coding of Teaching Videos Using Video-Annotation Tools
Dye, Brigham R., 18 July 2007
Teacher education programs that aspire to helping pre-service teachers develop expertise must help students engage in deliberate practice along dimensions of teaching expertise. However, field teaching experiences often lack the quantity and quality of feedback that is needed to help students engage in meaningful teaching practice. The limited availability of supervising teachers makes it difficult to personally observe and evaluate each student teacher's field teaching performances. Furthermore, when a supervising teacher debriefs such an observation, the supervising teacher and student may struggle to communicate meaningfully about the teaching performance. This is because the student teacher and supervisor often have very different perceptions of the same teaching performance. Video analysis tools show promise for improving the quality of feedback student teachers receive in their teaching performance by providing a common reference for evaluative debriefing and allowing students to generate their own feedback by coding videos of their own teaching. This study investigates the reliability of pre-service teacher coding using a video analysis tool. This study found that students were moderately reliable coders when coding video of an expert teacher (49%-68%). However, when the reliability of student coding of their own teaching videos was audited, students showed a high degree of accuracy (91%). These contrasting findings suggest that coding reliability scores may not be simple indicators of student understanding of the teaching competencies represented by a coding scheme. Instead, reliability scores may also be subject to the influence of extraneous factors. For example, reliability scores in this study were influenced by differences in the technical aspects of how students implemented the coding system. Furthermore, reliability scores were influenced by how coding proficiency was measured. 
Because this study also suggests that students can be taught to improve their coding reliability, further research may improve reliability scores, and make them a more valid reflection of student understanding of teaching competency, by training students about the technical aspects of implementing a coding system.
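The abstract does not say which reliability statistic produced the reported percentages. As one common and simple possibility, the sketch below computes percent agreement between a student's codes and a reference coding of the same video segments. The function name and the code labels are hypothetical, and the study may well have used a different measure.

```python
def percent_agreement(student_codes, reference_codes):
    """Share of segments (in percent) where both coders chose the same code."""
    if len(student_codes) != len(reference_codes):
        raise ValueError("both codings must cover the same segments")
    if not student_codes:
        raise ValueError("at least one coded segment is required")
    matches = sum(s == r for s, r in zip(student_codes, reference_codes))
    return 100.0 * matches / len(student_codes)


# Hypothetical labels for four video segments.
student = ["questioning", "feedback", "questioning", "modeling"]
reference = ["questioning", "feedback", "modeling", "modeling"]
agreement = percent_agreement(student, reference)
```

A measure like this depends on how segments are aligned before comparison, which is one way technical details of the coding procedure can move the score independently of what the coder understands.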
|
155 |
Exploiting Region Of Interest For Improved Video Coding
Gopalan, Ramya, 28 September 2009
No description available.
|
156 |
Design and Implementation of An Emulation Testbed for Video Communications in Ad Hoc Networks
Wang, Xiaojun, 09 February 2006
Video communication is an important application in wireless ad hoc network environments. Although current off-the-shelf video communication software works for ad hoc networks operating under stable conditions (e.g., extremely low link and node failure rates), video communication over ad hoc networks operating under extreme conditions remains a challenging problem. This is because traditional video codecs, whether single-stream or layered, require at least one relatively stable path between the source and destination nodes.
Recent advances in multiple description (MD) video coding have opened up new possibilities to offer video communications over ad hoc networks. In this thesis, we perform a systematic study on MD video for ad hoc networks. The theoretical foundation of this research is based on an application-centric approach to formulate a cross-layer multipath routing problem that minimizes the application layer video distortion. The solution procedure to this complex optimization problem is based on the so-called Genetic Algorithm (GA). The theoretical results have been documented in [7] and will be reviewed in Chapter 2.
Although the theoretical foundation for MD video over dynamic ad hoc networks has been laid, considerable skepticism remains in the research community about whether such cross-layer optimal routing can be implemented in practice. To fill this gap, this thesis is devoted to experimental research (a proof of concept) for the work in [7]. Our approach is to design and implement an emulation testbed in which the ideas and algorithms proposed in [7] can be implemented in a controlled laboratory setting. The highlights of our experimental research include:
1. A testbed that emulates three properties of a wireless ad hoc network: topology, link success probability, and link bandwidth;
2. A source routing implementation that can easily support comparative study between the proposed GA-based routing with other routing schemes under different network conditions;
3. A modified H.263+ video codec that employs Unequal Error Protection (UEP) approach to generate MD video;
4. Implementation of three experiments that
• compared the GA-based routing with existing technologies (NetMeeting video conferencing plus AODV routing);
• compared our GA-based routing with network-centric routing schemes (two-disjoint paths routing);
• proved that our approach has great potential in supporting video communications in wireless ad hoc networks.
5. Experimental results that show the proposed cross-layer optimization significantly outperforms the current off-the-shelf technologies, and that the proposed cross-layer optimization provides much better performance than network-centric routing schemes in supporting routing of MD video.
In summary, the experimental research in this thesis has demonstrated that a cross-layer multipath routing algorithm can be practically implemented in a dynamic ad hoc network to support video communications. / Master of Science
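To illustrate why multiple description video tolerates unstable links better than a single stream, the sketch below computes delivery probabilities when two descriptions travel over two link-disjoint paths. It assumes that links fail independently and that each path carries exactly one description; these assumptions, the function names, and the example probabilities are illustrative and are not taken from the thesis or its testbed.

```python
from math import prod


def path_success(link_probs):
    """Probability that every link on a path delivers, assuming independence."""
    return prod(link_probs)


def delivery_outcomes(path_a, path_b):
    """Probability of receiving both, exactly one, or no description."""
    pa, pb = path_success(path_a), path_success(path_b)
    return {
        "both": pa * pb,
        "exactly_one": pa * (1 - pb) + pb * (1 - pa),
        "none": (1 - pa) * (1 - pb),
    }


# Hypothetical link success probabilities: a two-hop path and a one-hop path.
outcomes = delivery_outcomes([0.9, 0.9], [0.8])
```

With these numbers the receiver gets at least one description with probability 0.962, even though neither path alone is better than 0.81. An application-centric router would weigh such outcomes by the video distortion each one produces instead of looking at path reliability alone.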
|
157 |
A hybrid scheme for low-bit rate stereo image compression
Jiang, Jianmin, Edirisinghe, E.A., 29 May 2009
We propose a hybrid scheme to implement an object-driven, block-based algorithm to achieve low bit-rate compression of stereo image pairs. The algorithm combines the simplicity and adaptability of existing block-based stereo image compression techniques with an edge/contour-based object extraction technique to determine an appropriate compression strategy for various areas of the right image. Unlike existing object-based coding such as MPEG-4, developed in the video compression community, the proposed scheme does not require any additional shape coding. Instead, the arbitrary shape is reconstructed from the matching object inside the left frame, which has been encoded by the standard JPEG algorithm and is hence available at the decoding end for the shapes in the right frame. The shape reconstruction for right objects incurs no distortion, owing to the unique correlation between left and right frames inside stereo image pairs and the nature of the proposed hybrid scheme. Extensive experiments show that the proposed algorithm achieves improvements of up to 20% in compression ratio over the existing block-based technique, while the reconstructed image quality remains competitive in terms of both PSNR values and visual inspection.
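The two figures of merit named in the abstract, PSNR and compression ratio, can be computed as in the sketch below. It assumes 8-bit greyscale images stored as lists of rows (peak value 255); the function names are illustrative and the example values are not results from the paper.

```python
import math


def mse(original, reconstructed):
    """Mean squared error between two images of identical dimensions."""
    diffs = [(o - r) ** 2
             for row_o, row_r in zip(original, reconstructed)
             for o, r in zip(row_o, row_r)]
    return sum(diffs) / len(diffs)


def psnr(original, reconstructed, peak=255):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mse(original, reconstructed)
    if error == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / error)


def compression_ratio(original_bytes, compressed_bytes):
    """Size of the uncompressed data divided by the size of the bitstream."""
    return original_bytes / compressed_bytes


# Toy 2 x 2 example: one pixel differs by 16 grey levels.
quality_db = psnr([[0, 0], [0, 0]], [[0, 0], [0, 16]])
```

Comparing two codecs fairly means comparing compression ratios at matched PSNR, or PSNR at matched bit rate, which is why the abstract reports both.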
|
158 |
Fast-forward functions on parallel video servers
Ding, Zhiyong, 01 January 1999
No description available.
|
159 |
Objective video quality analysis of MPEG-1, MPEG-2, and Windows Media video formats
Aeluri, Praveen, 01 July 2003
No description available.
|
160 |
Advancing Learned Lossy Image Compression through Knowledge Distillation and Contextual Clustering
Yichi Zhang (19960344), 29 October 2024
In recent decades, the rapid growth of internet traffic, particularly driven by high-definition images/videos, has highlighted the critical need for effective image compression to reduce bit rates and enable efficient data transmission. Learned lossy image compression (LIC), which uses end-to-end deep neural networks, has emerged as a highly promising method, even outperforming traditional methods such as the intra-coding of the versatile video coding (VVC) standard. This thesis contributes to the field of LIC in two ways. First, we present a theoretical bound-guided knowledge distillation technique, which utilizes estimated bound information rate-distortion (R-D) functions to guide the training of LIC models. Implemented with a modified hierarchical variational autoencoder (VAE), this method demonstrates superior rate-distortion performance with reduced computational complexity. Next, we introduce a token mixer neural architecture, referred to as contextual clustering, which serves as an alternative to conventional convolutional layers or self-attention mechanisms in transformer architectures. Contextual clustering groups pixels based on their cosine similarity and uses linear layers to aggregate features within each cluster. By integrating with current LIC methods, we not only improve coding performance but also reduce computational load.
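The grouping step described above can be sketched as follows: each pixel feature vector is assigned to the cluster center with the highest cosine similarity, and features are then aggregated within each cluster. This is a simplified illustration only. The learned linear layers of the thesis are replaced here by a plain mean, and the function names and fixed centers are assumptions made for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster_and_aggregate(features, centers):
    """Assign features to their most similar center, then average per cluster."""
    assignment = [
        max(range(len(centers)), key=lambda k: cosine(f, centers[k]))
        for f in features
    ]
    aggregated = []
    for k, center in enumerate(centers):
        members = [f for f, a in zip(features, assignment) if a == k]
        if members:
            # Plain mean as a stand-in for a learned aggregation layer.
            aggregated.append([sum(col) / len(members) for col in zip(*members)])
        else:
            aggregated.append(list(center))  # empty cluster keeps its center
    return assignment, aggregated


# Four 2-D pixel features and two hypothetical cluster centers.
pixels = [[1.0, 0.0], [2.0, 0.1], [0.0, 1.0], [0.1, 3.0]]
assignment, aggregated = cluster_and_aggregate(pixels, [[1.0, 0.0], [0.0, 1.0]])
```

Because cosine similarity ignores vector length, pixels are grouped by the direction of their features, so `[0.1, 3.0]` joins the same cluster as `[0.0, 1.0]` despite the difference in magnitude.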
|