Global ETD Search

161	Rápida predição da direção do bloco para aplicação com transformadas direcionais / Fast block direction prediction for directional transforms
Beltrão, Gabriel Tedgue. 12 May 2012.
Advisors: Yuzo Iano, Rangel Arthur. Master's dissertation, Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação. Master's degree in Electrical Engineering (Telecommunications and Telematics).
DCT-based transforms are widely adopted for video compression. Recently, many authors have pointed out that prediction residuals usually contain directional structures that the conventional DCT cannot represent efficiently. In this context, many directional transforms have been proposed to overcome the DCT's deficiency in dealing with such structures. Although directional transforms outperform the conventional DCT, applying them to video compression requires evaluating the resulting increase in coding time and implementation complexity. This work proposes a fast algorithm for estimating the directions present in a block before the directional transforms are applied. The encoder identifies the predominant direction in each block and applies only the transform corresponding to that direction. The algorithm can be used with any directional-transform scheme that relies on rate-distortion optimization (RDO) to select the direction to be explored, reducing implementation complexity to a level similar to that of using the conventional DCT alone.
Keywords: Image compression; Signal processing; MPEG (Video coding standard); Digital television; Coding theory
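As a rough illustration of estimating a block's predominant direction before choosing a transform, the sketch below uses a gradient structure tensor. This is an assumed stand-in technique, not the algorithm proposed in the dissertation, which the abstract does not detail. The function name `dominant_direction`, the four candidate angles, and the list-of-lists block representation are hypothetical choices.

```python
import math


def dominant_direction(block, candidates=(0, 45, 90, 135)):
    """Return the candidate angle (degrees) closest to the block's dominant
    structure direction, or None for a flat block.

    Hypothetical helper based on a gradient structure tensor; it is NOT the
    dissertation's algorithm. Angles use array coordinates (x = column, y = row).
    """
    jxx = jyy = jxy = 0.0
    rows, cols = len(block), len(block[0])
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx = block[y][x + 1] - block[y][x - 1]  # central difference along x
            gy = block[y + 1][x] - block[y - 1][x]  # central difference along y
            jxx += gx * gx
            jyy += gy * gy
            jxy += gx * gy
    if jxx + jyy == 0:
        return None  # no gradient energy: a caller could keep the conventional DCT
    # Orientation of the strongest gradient; structures run perpendicular to it.
    gradient_angle = 0.5 * math.degrees(math.atan2(2 * jxy, jxx - jyy))
    structure_angle = (gradient_angle + 90.0) % 180.0

    def angular_distance(candidate):
        d = abs(structure_angle - candidate) % 180.0
        return min(d, 180.0 - d)  # directions are equivalent modulo 180 degrees

    return min(candidates, key=angular_distance)


# A block whose intensity changes only along x contains vertical structures.
vertical_stripes = [[x for x in range(8)] for _ in range(8)]
print(dominant_direction(vertical_stripes))
```

An encoder following the idea in the abstract would then run only the transform associated with the returned direction instead of testing every candidate through RDO; how the dissertation actually makes that choice is not stated in the abstract.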
162	Error Detection for DMB Video Streams
Irani, Ramin. January 2013.
The purpose of this thesis is to detect errors in Digital Multimedia Broadcasting (DMB) transport streams. DMB uses the MPEG-4 standard to encapsulate Packetized Elementary Streams (PES) and the MPEG-2 standard to assemble them into transport stream packets. Much recent research has addressed video stream error detection, mostly by examining decoding parameters related to the frame; processing complexity can be a disadvantage of those methods. In this thesis, we investigate syntax errors caused by corruption in the header of the video transport stream, focusing on video streams that cannot be decoded. The proposed model filters video and audio packets to find the errors; the filters examine several sources that can affect video stream playback. The output of the method gives the type, location, and duration of each error. One advantage of the model is its structural simplicity: it can be implemented with three simple filters for detecting errors and a "calculation unit" for computing the duration of an error. Fast processing is another benefit.
Keywords: Digital Multimedia Broadcasting; MPEG-2 standard; MPEG-4 standard; video transport stream; Signal Processing; Computer Sciences; Telecommunications
163	Dispositif de rendu distant multimédia et sémantique pour terminaux légers collaboratifs / Semantic multimedia remote viewer for collaborative mobile thin clients
Joveski, Bojan. 18 December 2012.
Defining a multimedia remote viewer for mobile thin clients faces three scientific and technical constraints: (1) providing heterogeneous multimedia content and full collaboration functionality at the client side, (2) ensuring stable quality of service despite the constrained resources of the network (time-varying bandwidth, errors, and latency) and of the terminal (reduced computing and memory resources), and (3) achieving terminal independence and benefiting from community support.
The thesis addresses these challenges by developing a collaborative, semantic multimedia remote viewer. The principle is to convert, in real time, the graphical content generated by the application into a multimedia scene graph and to manage it according to the semantics of its components. The underlying architecture features novel components for scene-graph creation and management and for handling collaborative user events.
Bandwidth is optimized by the adaptive compression of the multimedia scene graph and the lossless compression of the collaborative messages, through two dedicated algorithms. The former creates a single scene graph that is intrinsically adaptable to network and terminal conditions. The latter dynamically generates and updates the encoding table according to the messages generated by the collaborating users. Both algorithms are patented. Direct collaborative functionality is provided at the content level by enriching the scene graph with a new type of node, which is in the process of ISO standardization.
The experimental setup uses the Linux X Window System on the server side and BiFS/LASeR multimedia scene technologies on the client side. The software demonstrator, named MASC (Multimedia Adaptive Semantic Collaboration), was objectively benchmarked against currently deployed solutions (VNC RFB and Microsoft RDP) on text-editing and web-browsing applications. The quantitative assessments show: (1) limited degradation of visual quality, with PSNR values between 30 and 42 dB and SSIM values above 0.9999; (2) downlink bandwidth gain factors ranging from 2 to 60; (3) efficient real-time user-event management, with network round-trip time reduced by factors of 4 to 6 and uplink bandwidth gain factors from 3 to 10; (4) feasible CPU activity, higher than with Microsoft RDP but reduced by a factor of 1.5 with respect to VNC RFB.
Keywords: Remote viewer; Semantic multimedia; Collaborative thin client; MPEG-4 BiFS and LASeR
164	Multi-View Video Transmission over the Internet
Abdullah Jan, Mirza, Ahsan, Mahmododfateh. January 2010.
3D television based on multi-view rendering is receiving increasing interest. In this technology, a number of video sequences are transmitted simultaneously to provide a wider view of the scene or a stereoscopic viewing experience; two views suffice for stereoscopic rendition. 3D displays are now available that can show several views simultaneously, so that the user sees different views by moving his or her head. The thesis work aims at implementing a demonstration system with a number of simultaneous views. The system includes two cameras, computers at both the transmitting and receiving ends, and a multi-view display. Besides setting up the hardware, the main task is to implement software so that the transmission can be carried out over an IP network. This report gives an overview of similar published systems and the experience gained from them, and describes the implementation of real-time video, its compression, encoding, and transmission over the Internet using socket programming, and finally the multi-view display in 3D format. It also describes the design considerations regarding video coding and network protocols in more detail.
Keywords: Multi-view Coding; Stereoscopic; H.264/AVC; Video Compression; Moving Picture Experts Group (MPEG); MPEG-4; 3DTV; Sockets; Streaming Server; Information Systems
165	Compression multimodale du signal et de l’image en utilisant un seul codeur / Multimodal compression of digital signal and image data using a unique encoder
Zeybek, Emre. 24 March 2011.
The objective of this thesis is to study and analyze a new compression strategy whose principle is to jointly compress data from several modalities using a single encoder. This approach is called "Multimodal Compression". In this setting, an image and an audio signal can be compressed together by an image encoder alone (e.g. a standard one), without the need to integrate an audio codec. The basic idea developed in the thesis is to insert the samples of a signal in place of some pixels of the "carrier" image while preserving the quality of the information after encoding and decoding. This technique should not be confused with watermarking or steganography, since the aim is not to conceal one piece of information inside another. The two main objectives of Multimodal Compression are to improve compression performance in rate-distortion terms and to optimize the use of the hardware resources of a given embedded system (e.g. by speeding up encoding and decoding). The report studies and analyzes variants of Multimodal Compression whose core is the design of mixing and separation functions applied around the coding step. The approach is validated on common images and signals as well as on specific data such as biomedical images and signals. The work concludes with an extension of the Multimodal Compression strategy to video.
Keywords: Multimodal compression; JPEG 2000; Wavelet decomposition; Spline interpolation; Quadtree; MPEG
166	Improving quality of experience in multimedia streaming by leveraging Information-Centric Networking / Améliorer la qualité d'expérience du streaming multimédia en tirant parti des réseaux centrés sur l'information
Samain, Jacques. 19 March 2019.
Information-Centric Networking (ICN) is a promising architecture for addressing today's explosion of Internet multimedia traffic and increasing user mobility: it can not only enhance the user's quality of experience but also naturally and seamlessly extend video support deeper into the network functions. However, to the best of our knowledge, a thorough assessment of the benefits that ICN brings to multimedia delivery has not yet been carried out. In this thesis, we aim to narrow the gap to such an assessment by considering ICN in various multimedia delivery scenarios.
First, we assess the benefits of ICN-based Dynamic Adaptive Streaming (DAS) compared with TCP/IP-based streaming, through an experimental campaign that includes multiple channels (e.g., emulated Wi-Fi and LTE, real 3G/4G traces), multiple clients (homogeneous vs. heterogeneous mixes, synchronous vs. asynchronous arrivals), and DAS adaptation logics carefully selected to cover the broad families of available adaptation algorithms. We also highlight potential pitfalls, which are nonetheless easily avoidable.
Second, we show how network assistance helps improve the users' quality of experience. To do so, we leverage the in-network caching feature of ICN and propose a simple periodic network signal from the cache (i.e., the per-quality hit ratio) that the DAS adaptation logic can exploit to further enhance the user's quality of experience by avoiding the well-known cache-induced quality oscillations. Experiments confirm the soundness of our approach.
Finally, as live multimedia delivery is gaining momentum, we propose hICN-RTC, which integrates hICN (hybrid ICN), an ICN-in-IP solution, with WebRTC, and we design RICTP (Realtime Information Centric Transport Protocol), a content-aware transport protocol that minimizes communication latency. Although hICN-RTC is still in development, the results of early experiments are promising: they show that the traffic it induces grows with the number of active speakers rather than with the total number of participants.
Keywords: Information-centric networking (ICN); Multimedia streaming; Quality of experience (QoE); MPEG-DASH; WebRTC
167	Simulace zkreslení zvukového signálu v percepčním zvukovém kodéru / Simulation of Audio Signal Distortion in Perceptual Audio Encoder
Peloušek, Tomáš. January 2021.
This thesis deals with the creation of a programme that simulates the distortion arising during lossy audio coding. MATLAB was chosen as the development environment. The practical part of the thesis is an encoder that changes the subjective signal quality according to the user's bitrate preference. It is based on a dynamic bit allocation technique and includes an optional window-switching algorithm. The theoretical background explains the main principles of lossy coding, with emphasis on the operating principles of MPEG-1 Layer 3. The practical chapter describes how the programme and its parts work and presents the results of the quality testing that was carried out. The testing used the objective assessment method PEMO-Q and compared the objective quality of the programme's outputs with that of samples produced by a regular MP3 encoder with identical settings.
168	Využití maskovacích efektů pro vodoznačení audio dat / Using masking effects for audio data watermarking
Kabourek, Jiří. January 2008.
This work presents a technique for embedding a digital watermark in digital audio signals. A digital watermark must be imperceptible and should be robust against attacks and other types of distortion. The implemented algorithm embeds the watermark using the spread-spectrum technique and the ISO MPEG-1 Layer I psychoacoustic model. Robustness was tested against signal filtering, MP3 compression, and resampling.
169	MPEG-4 AVC stream watermarking / Tatouage du flux compressé MPEG-4 AVC
Hasnaoui, Marwen. 28 March 2014.
This thesis addresses MPEG-4 AVC stream watermarking from both theoretical and applicative standpoints, considering two application domains: ownership protection and content integrity verification. From the theoretical point of view, the main challenge is to develop a unified watermarking framework (insertion and detection) able to serve both applications in the compressed domain. From the methodological point of view, the challenge is to instantiate this theoretical framework for the targeted applications.
The first main contribution is a theoretical framework for multi-symbol watermarking based on quantization index modulation (m-QIM). The insertion rule is designed analytically by extending the binary QIM rule to the multi-symbol case, and the detection rule is optimized to minimize the probability of error under additive white Gaussian noise attacks. It is thus demonstrated that the data payload can be increased by a factor of log2(m) for prescribed transparency and additive Gaussian noise power. A data payload of 150 bits per minute, i.e. about 20 times larger than the limit imposed by the DCI standard, is obtained.
The second main contribution is the specification of a preprocessing (shaping) operation for MPEG-4 AVC that eliminates the intra-frame drift effect. Drift is the spread of distortion in the compressed stream that results from the MPEG encoding paradigm. The drift propagation problem in MPEG-4 AVC is expressed algebraically, based on the analytic expressions of the encoding operations, and the corresponding system of equations is solved under drift-free constraints. Drift-free shaping yields a transparency gain of 2 dB in PSNR.
Keywords: Stream watermarking; MPEG-4 AVC; m-QIM; Ownership protection; Integrity verification; Drift-free
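To illustrate the multi-symbol QIM idea in this abstract, here is a minimal scalar sketch: each of the m symbols gets its own shifted quantization lattice, and detection picks the nearest lattice, so each marked sample carries log2(m) bits instead of 1. The function names, the uniformly spaced lattice offsets, and the parameter values are assumptions for illustration only; the thesis's actual insertion and detection rules operate on MPEG-4 AVC stream data and are not reproduced here.

```python
import math


def qim_embed(x, symbol, m, delta):
    """Quantize host sample x onto the lattice assigned to `symbol` (0..m-1).

    Assumed layout: lattice k is the set {j * delta + k * delta / m}.
    """
    offset = symbol * delta / m
    return delta * round((x - offset) / delta) + offset


def qim_detect(y, m, delta):
    """Return the symbol whose lattice has the point closest to y."""
    def distance(symbol):
        offset = symbol * delta / m
        nearest = delta * round((y - offset) / delta) + offset
        return abs(y - nearest)
    return min(range(m), key=distance)


def bits_per_sample(m):
    """Payload carried by one marked sample; 1 bit for binary QIM (m = 2)."""
    return math.log2(m)


# Usage: embed four 4-ary symbols, perturb the marked samples, then detect.
delta, m = 8.0, 4
host = [3.7, -12.1, 40.0, 0.2]
symbols = [0, 1, 2, 3]
marked = [qim_embed(x, s, m, delta) for x, s in zip(host, symbols)]
# The perturbation stays below delta / (2 * m), the decision margin here.
recovered = [qim_detect(y + 0.9, m, delta) for y in marked]
print(recovered == symbols, bits_per_sample(m))
```

In this simplified layout, keeping delta fixed while raising m shrinks the spacing between neighbouring lattices to delta / m, so the tolerated perturbation drops. The abstract states that the thesis obtains the log2(m) payload gain for prescribed transparency and noise power; this sketch does not attempt to reproduce that result.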
170	Enriched in-band video : from theoretical modeling to new services for the society of knowledge / In-band enriched video : de la modélisation théorique aux nouveaux services pour la société des connaissances
Belhaj Abdallah, Maher. 05 December 2011.
This thesis, developed at Institut Télécom, Télécom SudParis under the "Futur et Rupture" framework, explores the in-band enrichment paradigm from both theoretical and applicative points of view. Having emerged with the knowledge society, enriched media refers to any association between metadata (textual, audio, video, executable code...) and a given original media item. The concept can be deployed in a wide variety of applications, such as iDTV (interactive digital TV), games, and data mining. The notion of in-band enrichment, conceived and developed by M. Mitrea's team at the ARTEMIS Department of Télécom SudParis, assumes that the enrichment data are inserted directly into the very content to be enriched. In-band enrichment can therefore be supported by watermarking techniques, provided they can carry the very large data payload this new type of application requires, i.e. 10 to 1000 times larger than that needed for authentication or copyright protection. Whereas the mark is traditionally inserted in the uncompressed domain, the constraints of many emerging applications (such as VoD, video on demand, or iDTV) and the advent of ubiquitous media computing and storage make real-time watermarking in the compressed domain an important theoretical and applicative challenge. A priori, watermarking in the compressed domain is a contradiction in terms: compression eliminates the visual redundancy that watermarking exploits to insert the mark imperceptibly, which makes the host more sensitive to modifications and the host/mark association more fragile.
Keywords: Enriched video; Watermarking; H.264; MPEG-4 AVC; Perceptual masking; Stirmark; QIM; DVQ
