Global ETD Search

51	Arquitetura de um decodificador de áudio para o Sistema Brasileiro de Televisão Digital e sua implementação em FPGA [Architecture of an audio decoder for the Brazilian Digital Television System and its implementation on FPGA] Renner, Adriano January 2011
MPEG-4 Advanced Audio Coding (AAC) is the audio coding algorithm adopted by the Brazilian Digital Television System (SBTVD), in the Low Complexity, High Efficiency version 1 and High Efficiency version 2 profiles. This work presents a detailed study of the standard, ranging from psychoacoustic concepts such as masking to the decoding process for the AAC bitstream, always with the SBTVD in mind. A hardware architecture is proposed for a decoder compatible with MPEG-4 AAC LC. The decoder is split into two main blocks, one of which contains the filter bank, considered the most computationally demanding part. The other block decodes the coded spectrum, which includes the second most demanding task of the system: Huffman decoding. 
Finally, the implementation of the proposed architecture in VHDL, for prototyping on an FPGA of the Altera Cyclone II family, is described.
Keywords: Microelectronics; Digital audio; FPGA; MPEG-4 AAC; SBTVD; AAC LC; Audio decoding
52	Projeto de um modelador 3D colaborativo baseado no padrão emergente MPEG-4 MU [Design of a collaborative 3D modeler based on the emerging MPEG-4 MU standard] Duarte, Fernando Vieira 19 August 2003
In recent years we have seen advances in networking technologies as well as in computer graphics and display technology. As a consequence, many Collaborative Virtual Environments have emerged, thanks to the increasing computational capabilities of desktop computers, the enormous growth in network bandwidth and the ubiquity of the Internet. Collaborative virtual environments for 3D modeling are characterized by the interaction among multiple users to create and/or modify shared 3D objects. These environments can be used, for instance, to model and visualize virtual prototypes in order to reduce costs in the product design process. The challenges of building collaborative 3D modeling environments relate mainly to real-time rendering of the modified objects, to user interaction with the virtual environment, and to maintaining the consistency of the shared virtual environment. Only a few collaborative 3D modeling environments are found in the literature, and their designs usually result in complex, non-standardized solutions for collaboration among users. This work presents the implementation of a collaborative environment for 3D modeling based on the emerging MPEG-4 MU (Multi-User) standard. With this environment, 3D graphic scenes can be created in real time by multiple participants in synchronous collaborative sessions. These scenes can be viewed on any MPEG-4 terminal, including cell phones and personal digital assistants (PDAs). 
Session control, consistency maintenance, concurrency control and 3D object locking are performed by the MSC (MUTech Session Controller) and MBK (MUTech Bookkeeper) components, through the Pilot/Drone mechanism and the BIFS-Command protocol. All these components are defined by the emerging MPEG-4 MU standard and were implemented by the Networked Virtual Reality Lab (LRVNet) at the Federal University of São Carlos.
Keywords: Virtual reality; Collaborative virtual environments; 3D modeling; MPEG-4
53	Uma estrutura de suporte para adaptação em jogos 3D multiusuário [A support framework for adaptation in multi-user 3D games] Silva, Alessandro Rodrigues e 26 August 2003
With the growing dissemination and reliability of wireless networks and the emergence of devices with ever more processing and communication power, applications until now restricted to PCs are being envisaged for devices as heterogeneous as wristwatches with GPS locators, refrigerators with Internet access, mobile phones, PDAs, set-top boxes and game consoles. Integrating this myriad of devices and network technologies, with their different capabilities, demands special attention from software programmers and designers, especially when applications are shared among multiple users on different devices, each one producing and consuming information according to its device's capacity. In this sense, application adaptation, which allows software to react to variations in the resources it uses, is an important process for fitting an application to a given device configuration, and it is increasingly regarded as an important part of systems that operate in heterogeneous computing and network environments. Although a large amount of work has been devoted to adapting traditional multimedia, such as text, images, audio and video, less attention has been paid to 3D digital content, first because of the complexity of the issues involved in adapting 3D applications, and also because real market opportunities for 3D graphics on heterogeneous devices have only just begun to emerge. Moreover, initiatives that promote the standardization of adaptation must be considered, so that interoperability among adaptation mechanisms can become a reality. 
This work investigated the MPEG-4 standard and its extension MPEG-J, which so far address adaptation only superficially, and proposes an adaptation framework for 3D multi-user game applications, named MMGAME. With this framework, the elements that make up a game application can be adapted according to variations in the resources of the device running the application and of the network to which that device is connected.
Keywords: Collaborative virtual environments; 3D environments; Content adaptation; MPEG-4; Multi-user
56	Towards Optimal Quality of Experience via Scalable Video Coding Ni, Pengpeng January 2009
To provide a universal multimedia experience, multimedia streaming services need to transparently handle the variation and heterogeneity of their operating environment. From the standpoint of a streaming application, video adaptation techniques are intended to cope with environmental variations by manipulating the video content itself. Scalable video coding (SVC) schemes, such as the SVC extension of the H.264 standard, are highly attractive for designing a self-adaptive video streaming system. When SVC is employed in a streaming system, the resulting video stream can easily be truncated or tailored to form several sub-streams, which can be decoded separately to obtain a range of picture sizes, quality levels and frame rates. However, how to perform adaptation using SVC, and how much adaptation SVC enables, remain open research questions. We still lack a thorough understanding of how to automate the scaling procedure in order to achieve optimal video Quality-of-Experience (QoE) for end users. Video QoE depends highly on human perception. In this thesis, we present several video QoE studies on the usability of H.264 SVC. Several factors that contribute significantly to overall QoE have been identified and evaluated in these studies. As examples of factors related to application usage, playback smoothness and application response time are critical performance measures that can benefit from temporal scalability. Targeting applications that require frequent interactivity, we propose a transcoding scheme that fully exploits the Switching P and Switching I frames specified in H.264 to enhance the temporal scalability of a video stream. 
Focusing on factors related to visual quality, a series of carefully designed subjective quality assessment tests were performed on mobile devices to investigate the effects of multi-dimensional scalability on human quality perception. Our study reveals that QoE degrades non-monotonically with bitrate and that scaling order preferences are content-dependent. Another study finds that the flickering effect caused by frequent switching between layers in SVC-compliant bitstreams is highly related to the switching period. When the period is above a certain threshold, the flickering effect disappears and layer switching should not be considered harmful. We have also examined user-perceived video quality in 3D virtual worlds. Our results show that the avatar's distance to the virtual screen in 3D worlds contributes significantly to video QoE: for a wide range of distortion, there always exists a feasible virtual distance from which the distortion is not detectable by most people, which makes it worthwhile to perform video adaptation. The work presented in this thesis is intended to help improve the design of self-adaptive video streaming services that can deliver video content independently of network technology and end-device capability while seeking the best possible video experience. / Ardendo småföretagsdoktorand (small-business doctoral student)
Keywords: Quality-of-Experience; advanced video coding; MPEG-4; H264/AVC; Computer Sciences
57	Vysílání multimediálního obsahu s využití kompresních technik / Streaming of compressed multimedia content Tesař, Pavel January 2009
This master's thesis deals with the transmission of multimedia over a network using the compression algorithms of the MPEG-4 Part 10 codec and the Real-time Transport Protocol (RTP). The first part introduces basic multimedia terminology, covers the principles of audio and video compression, and justifies why compression is necessary when transmitting multimedia over a network. It also describes the principles and properties of RTP, which was designed for multimedia streams. A suitable codec is then chosen for implementing two applications (a client and a server) that read raw video and audio data, compress it with the Theora (Vorbis) codec, transfer it over the network in RTP packets, and decode and play it back for the end user. The implementation of both applications, written in C/C++ and Java, is described in full. The thesis offers a basic overview of the topic together with the experience gained from the implementation.
58	[en] A SYSTEM FOR GENERATING DYNAMIC FACIAL EXPRESSIONS IN 3D FACIAL ANIMATION WITH SPEECH PROCESSING / [pt] UM SISTEMA DE GERAÇÃO DE EXPRESSÕES FACIAIS DINÂMICAS EM ANIMAÇÕES FACIAIS 3D COM PROCESSAMENTO DE FALA PAULA SALGADO LUCENA RODRIGUES 24 April 2008
This thesis presents a system for generating dynamic facial expressions synchronized with speech and rendered on a realistic three-dimensional face. 
Dynamic facial expressions are those that vary over time and are semantically related to emotions, speech and affective phenomena that can modify the behavior of a face in an animation. The thesis defines an emotion model for talking virtual characters, named VeeM (Virtual emotion-to-expression Model), which is based on a reinterpretation and restructuring of Plutchik's emotional wheel model. VeeM introduces the concept of an emotional hypercube in the canonical space of R^4 to combine basic emotions and give rise to derived emotions. To validate VeeM, an authoring and presentation tool for facial animation, named DynaFeX (Dynamic Facial eXpression), was developed, in which speech processing is performed to synchronize phonemes and visemes. The tool allows emotions to be defined and refined for each frame or group of frames of a facial animation. Alternatively, the authoring subsystem also allows high-level manipulation through animation scripts. The presentation subsystem controls, in a synchronized way, the character's speech and the edited emotional aspects. DynaFeX uses a three-dimensional polygonal mesh based on the MPEG-4 facial animation standard, which favors the tool's interoperability with other facial animation systems.
Keywords: MPEG-4; Facial animation; Dynamic facial expressions; Speech processing; Emotion model; Emotional hypercube; Virtual talking characters
59	[en] INTEGRATION AND INTEROPERABILITY OF MPEG-4 AND NCL DOCUMENTS / [pt] INTEGRAÇÃO E INTEROPERABILIDADE DE DOCUMENTOS MPEG-4 E NCL ROMUALDO MONTEIRO DE RESENDE COSTA 27 June 2005
The object-oriented approach that the MPEG-4 standard takes to encoding audiovisual content is similar to those used in many models and languages for multimedia/hypermedia document specification. Among those languages, NCL (Nested Context Language), used in the HyperProp system, introduces a series of new concepts that can be advantageously integrated into the standard. 
This work first proposes the conversion of documents specified in NCL to MPEG-4 (XMT-O) and vice versa, allowing authoring and formatting tools to be used in the specification and presentation of documents in both languages. It also proposes incorporating MPEG-4 scenes both as media objects and as compositions of the NCL language, allowing relationships to be established among scenes. To allow these new NCL objects to be displayed, an MPEG-4 player is incorporated into the HyperProp Formatter. The player can report the occurrence of events to the controller, which, among other things, enables synchronization between MPEG-4 scenes and other NCL objects, including other MPEG-4 scenes. Finally, exploiting the concept of templates introduced by the NCL language, authoring capability in MPEG-4 is extended through the definition of new semantics for XMT-O language compositions and the design of compilers for this language.
Keywords: Integration; Hypermedia systems; Authorship; Templates; MPEG-4; NCL; XMT-O; XMT-A; BIFS
60	Visual saliency extraction from compressed streams / Extraction de la saillance visuelle à partir de flux compressés Ammar, Marwa 15 June 2017
The theoretical ground for visual saliency was established some 35 years ago by Treisman, who advanced the feature-integration theory for the human visual system: in any visual content, some regions are salient (appealing) because of the discrepancy between their features (intensity, color, texture, motion) and those of their surrounding areas. This thesis offers a comprehensive methodological and experimental framework for extracting salient regions directly from compressed video streams (namely MPEG-4 AVC and HEVC), with minimal decoding operations. Saliency extraction from the compressed domain is a priori a conceptual contradiction. On the one hand, as suggested by Treisman, saliency is given by visual singularities in the video content. On the other hand, in order to eliminate visual redundancy, compressed streams are no longer expected to feature singularities. The thesis also brings to light the practical benefit of compressed-domain saliency extraction. In this respect, robust video watermarking is targeted, and it is demonstrated that the saliency map acts as an optimization tool, allowing transparency to be increased (for a prescribed quantity of inserted information and robustness against attacks) while decreasing the overall computational complexity. 
As an overall conclusion, the thesis demonstrates, both methodologically and experimentally, that although the MPEG-4 AVC and HEVC standards do not explicitly rely on any visual saliency principle, their stream syntax elements preserve this remarkable property linking the digital representation of video to sophisticated human psycho-cognitive mechanisms.
Keywords: Human visual system; Visual saliency extraction; Compressed stream; MPEG-4 AVC; HEVC; Saliency map; Fixation map; Saccade location; Watermarking
