  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
21

Personalização e adaptação de conteúdo baseadas em contexto para TV Interativa / Context-based content personalization and adaptation for Interactive TV

Rudinei Goularte 10 November 2003
O trabalho apresentado nesta tese trata do desenvolvimento de técnicas com suporte à ciência de contexto, baseadas nos padrões MPEG-4 e MPEG-7, para personalizar e adaptar conteúdo em TV Interativa. Um dos desafios dessa área é desenvolvimento de programas personalizados com rico conteúdo multimídia, com alta interatividade e que, além disso, sejam acessíveis a partir de uma variedade de dispositivos (fixos ou móveis), atendendo às expectativas de interação e de acesso dos usuários. Grande parte do problema está no fato de que os modos encontrados na literatura para representar, descrever e compor programas de TV Interativa não oferecem suporte a contexto, não permitem a separação entre descrições de programas e descrições de objetos e possuem baixa granulosidade de segmentação. Essas características dificultam e, em alguns casos, impedem o desenvolvimento de aplicações avançadas em TV Interativa. As técnicas desenvolvidas neste trabalho são baseadas em esquemas de descrição, compatíveis com o padrão MPEG-7, e na segmentação de programas em objetos MPEG-4. Os esquemas são utilizados para descrever a estrutura, a composição e a semântica de programas e de seus objetos componentes. Também foi definida e implantada uma infra-estrutura para produção, distribuição e consumo de programas. A utilização conjunta da infra-estrutura e das técnicas permite o desenvolvimento de aplicações avançadas em TV Interativa. Como um exemplo dessas aplicações, foi desenvolvido um serviço automático para personalizar e adaptar programas de TV Interativa, permitindo que um usuário possa acessar, sob demanda, programas especialmente produzidos para ele, contendo apenas assuntos de seu interesse e permitindo que o acesso possa ser realizado por dispositivos fixos ou móveis. / The work presented in this thesis developed techniques with context-awareness support, based on the MPEG-4 and MPEG-7 standards, in order to personalize and to adapt Interactive TV content. 
One of the challenges in this area is the development of personalized programs with rich multimedia content and high interactivity that are accessible from a variety of devices (mobile and non-mobile). A large part of the problem is that the approaches found in the literature do not provide context support, do not allow program descriptions to be separated from object descriptions, and have low segmentation granularity. These limitations make the development of advanced Interactive TV applications difficult or, in some cases, impossible. The techniques developed in this work are based on MPEG-7-compliant description schemes and on the segmentation of programs into MPEG-4 objects. The schemes are used to describe the structure, composition and semantics of programs and of their component objects. An infrastructure for the creation, delivery and consumption of Interactive TV programs was also defined and deployed. Used together, the infrastructure and the techniques allow the development of advanced Interactive TV applications. As an example of such applications, this work developed an automatic Interactive TV personalization and adaptation service. This service allows a user to access, on demand, programs produced specifically for that user, containing only subjects of interest, from either mobile or non-mobile devices.
22

La plate-forme RAMSES pour un triple écran interactif : application à la génération automatique de télévision interactive / The RAMSES platform for triple display : application to automatic generation of interactive television

Royer, Julien 16 December 2009
Avec la révolution du numérique, l’usage de la vidéo a fortement évolué durant les dernières décennies, passant du cinéma à la télévision, puis au web, du récit fictionnel au documentaire et de l’éditorialisation à la création par l’utilisateur. Les médias sont les vecteurs pour échanger des informations, des connaissances, des « reportages » personnels, des émotions... L’enrichissement automatique des documents multimédias est toujours un sujet de recherche depuis l’avènement des médias. Dans ce contexte, nous proposons dans un premier temps une modélisation des différents concepts et acteurs mis en œuvre pour analyser automatiquement des documents multimédias afin de déployer dynamiquement des services interactifs en relation avec le contenu des médias. Nous définissons ainsi les concepts d’analyseur, de service interactif, de description d’un document multimédia et enfin les fonctions nécessaires pour faire interagir ceux-ci. Le modèle d’analyse obtenu se démarque de la littérature en proposant une architecture modulaire, ouverte et évolutive. Nous présentons ensuite l’implantation de ces concepts dans le cadre d’un prototype de démonstration. Ce prototype permet ainsi de mettre en avant les contributions avancées dans la description des modèles. Une implantation ainsi que des recommandations sont détaillées pour chacun des modèles. Afin de montrer les résultats d’implantation des solutions proposées sur la plateforme telles que les standards MPEG-7 pour la description, MPEG-4 BIFS pour les scènes interactives ou encore OSGI pour l’architecture générale, nous présentons différents exemples de services interactifs intégrés dans la plateforme. Ceux-ci permettent de vérifier les capacités d’adaptation aux besoins d’un ou plusieurs services interactifs. / The concept developed in this thesis is to propose an architecture model allowing automatic multimedia analysis and inserting pertinent interactive contents accordingly to multimedia content. 
Until now, studies have mainly tried to provide tools and frameworks that generate a full description of the multimedia content. This is comparable to trying to describe the world, since the system must have huge description capabilities. In practice, it is not possible to represent the world through a tree of concepts and relationships, because of time and computing limitations. Therefore, given the number of multimedia analyzers developed all over the world, this thesis proposes a platform able to host, combine and share existing multimedia analyzers. Furthermore, we consider the user's requirements in order to select only the elements of the platform that are needed to analyze the multimedia content. To adapt the platform easily to the service requirements, we propose a modular architecture based on plug-in multimedia analyzers that generate the contextual description of the media. In addition, we provide an interactive scene generator to dynamically create related interactive scenes. We chose the MPEG-7 standard to implement the multimedia description and the MPEG-4 BIFS standard to implement interactive scenes in the multimedia content. We also present experimental results on different kinds of interactive services using real-time extraction of information from video. The main implemented example concerns an interactive mobile TV application related to parliamentary sessions. This application aims to provide additional information to users by automatically inserting interactive content (complementary information, subject of the current session…) into the original TV program. In addition, we demonstrate the capacity of the platform to adapt to applications in multiple domains through a set of simple interactive services (goodies, games...).
23

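Χρήση του προτύπου MPEG-4 ALS και διακαναλλική πρόβλεψη για κωδικοποίηση πολυκαναλλικού ηλεκτροκαρδιογραφήματος / Use of the MPEG-4 ALS standard and inter-channel prediction for coding multi-channel electrocardiograms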

Κωνσταντίνου, Ιωάννης 03 July 2009
Είναι γεγονός ότι το ηλεκτροκαρδιογράφημα είναι ένα πολύ καλά μελετημένο σήμα. Ειδικά τα τελευταία χρόνια, έχει προταθεί ένας μεγάλος αριθμός αλγορίθμων επεξεργασίας, συμπίεσης, αυτόματης διάγνωσης, φιλτραρίσματος, αποθορυβοποίησης και κωδικοποίησης. Σ’ αυτή τη διπλωματική εργασία, προτείνουμε ένα αποδοτικό αλγόριθμο κωδικοποίησης χωρίς απώλειες για δεδομένα από δωδεκακάναλλο ηλεκτροκαρδιογράφημα. Ο κωδικοποιητής υλοποιεί ένα πολυγραμμικό μοντέλο υψηλής απόδοσης, το οποίο είναι «ειδικευμένο στους ασθενείς», ενδοκαναλικής πρόβλεψη και εφαρμόζει το πρότυπο κωδικοποίησης MPEG-4 ALS για διακαναλική πρόβλεψη και κωδικοποίηση. Τα αποτελέσματα του αλγορίθμου συγκρίθηκαν με τεχνικές κωδικοποίησης εντροπίας χωρίς απώλειες και δείχνουν αύξηση της απόδοσης κωδικοποίησης. / The electrocardiogram (ECG) is one of the most thoroughly studied medical signals. A large number of ECG processing algorithms have been proposed over the years, covering ECG noise filtering, automated diagnostic interpretation and coding. In this master's thesis, we propose a robust multi-channel ECG encoder architecture that operates on 12-channel ECG data. The encoder utilizes highly efficient multi-linear patient-specific models for inter-channel prediction and the MPEG-4 Audio Lossless Coding (ALS) architecture for intra-channel prediction and coding. The results of the algorithm show improved performance over standard encoding techniques.
24

MPEG-4 AVC traffic analysis and bandwidth prediction for broadband cable networks

Lanfranchi, Laetitia I. 30 June 2008
In this thesis, we analyze the bandwidth requirements of MPEG-4 AVC video traffic and then propose and evaluate the accuracy of new MPEG-4 AVC video traffic models. First, we analyze the bandwidth requirements of the videos by comparing the statistical characteristics of the different frame types. We analyze their coefficient of variability, autocorrelation, and cross-correlation in both the short and the long term. The Hurst parameter is also used to investigate the long-range dependence of the video traces. We then provide insight into B-frame dropping and its impact on the statistical characteristics of the video trace. This leads us to design two algorithms that predict the size of the B-frame and the size of the group of pictures (GOP) in the short term. To evaluate the accuracy of the prediction, a model for the error is proposed. In a broadband cable network, B-frame size prediction can be employed by a cable headend to provision video bandwidth efficiently or, more importantly, to reduce bit rate variability and bandwidth requirements via selective B-frame dropping, thereby minimizing buffering requirements and packet losses at the set-top box. It will be shown that the model provides highly accurate prediction, in particular for movies encoded in high-quality resolution. The GOP size prediction can be used to provision bandwidth. We then enhance the B-frame and GOP size prediction models using a new scene change detector metric. Finally, we design an algorithm that predicts the size of different frame types in the long term. Clearly, a long-term prediction algorithm may suffer degraded prediction accuracy, and its higher complexity may result in higher latency. However, this is offset by the additional time available for long-term prediction and by the need to forecast bandwidth usage well ahead of time in order to minimize packet losses during periods of peak bandwidth demand. We also analyze the impact of the video quality and the video standard on the accuracy of the model.
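The trace statistics named in this abstract can be illustrated with a minimal sketch. It assumes the "coefficient of variability" is the usual coefficient of variation (standard deviation over mean) and uses the biased sample autocorrelation; the function names and the frame-size trace are hypothetical and are not taken from the thesis.

```python
from statistics import mean, pstdev


def coefficient_of_variation(sizes):
    """Population standard deviation divided by the mean (mean must be non-zero)."""
    return pstdev(sizes) / mean(sizes)


def autocorrelation(sizes, lag):
    """Biased sample autocorrelation at the given lag (trace must not be constant)."""
    mu = mean(sizes)
    variance_sum = sum((s - mu) ** 2 for s in sizes)
    covariance_sum = sum(
        (sizes[i] - mu) * (sizes[i + lag] - mu) for i in range(len(sizes) - lag)
    )
    return covariance_sum / variance_sum


# Hypothetical frame sizes in bytes: two GOPs of six frames, large I-frame first.
frame_sizes = [9000, 1200, 1100, 3000, 1300, 1000,
               9500, 1250, 1150, 3100, 1280, 1050]

cov = coefficient_of_variation(frame_sizes)
acf_gop = autocorrelation(frame_sizes, 6)  # lag equal to the assumed GOP length
```

For a trace with a repeating GOP structure, the autocorrelation at a lag equal to the GOP length is positive, which is the kind of periodicity a frame-size predictor can exploit.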
25

Dispositif de rendu distant multimédia et sémantique pour terminaux légers collaboratifs / Semantic multimedia remote viewer for collaborative mobile thin clients

Joveski, Bojan 18 December 2012
Développer un système de rendu distant pour terminaux légers et mobiles traitant d'objets multimédias et de leur sémantique consiste à (1) offrir une véritable expérience multimédia collaborative au niveau du terminal, (2) assurer la compatibilité avec les contraintes liées au réseau (bande passante, erreurs et latence variables en temps) et au terminal (ressources de calcul et de mémoire réduites) et (3) s'affranchir des types de terminaux et des spécificités des communautés.Cette thèse traite de ces enjeux et se positionne en rupture avec l'état de l'art en développant une architecture support fondée sur la gestion sémantique du contenu multimédia. Le principe consiste à convertir en temps réel le contenu graphique généré par l'application en un graphe de scène multimédia et à le gérer en fonction de la sémantique de ses composantes.L'optimisation de la bande passante est assurée par la compression adaptative du graphe de scène et par la compression sans perte des messages de collaboration. Les deux méthodes développées sont caractérisées respectivement par la création d'un unique graphe de scène intrinsèquement adaptable au réseau/terminal et par la mise à jour dynamique du dictionnaire de codage en fonction des messages générés par les utilisateurs. Elles sont brevetées.Les fonctionnalités collaboratives interviennent directement au niveau du contenu grâce à l'enrichissement du graphe de scène par un nouveau type de nœud, dont la normalisation ISO est en cours.Le démonstrateur logiciel sous-jacent, dénommé MASC (Multimedia Adaptive Semantic Collaboration), permet de comparer objectivement cette nouvelle architecture aux solutions actuellement déployées par des acteurs majeurs du domaine (VNC RBF ou Microsoft RDP). Deux types d'application ont été considérés : l'édition du texte et la navigation sur Internet. 
Les évaluations quantitatives montrent: (1) un impact limité des artéfacts visuels de conversion (PSNR compris entre 30 et 42 dB et SSIM supérieur à 0,9999), (2) consommation de la bande passante downlink (resp. uplink) réduite d'un facteur de 2 à 60 (resp. de 3 à 10), (3) latence dans la transmission des événements générés par l'utilisateur réduite d'un facteur de 4 à 6, (4) consommation des ressources de calcul côté client réduite d'un facteur 1,5 par rapport à VNC RFB. / Defining a multimedia remote viewer for mobile thin clients faces three scientific/technical constraints: (1) providing heterogeneous multimedia content and support for full collaboration functionalities at the client side, (2) ensuring a stable quality of service despite the constrained resources of the network and the terminal, and (3) featuring terminal independence and benefiting from community support. The present thesis addresses these challenges by developing a collaborative, semantic multimedia remote viewer. The underlying architecture features novel components for scene-graph creation and management, as well as for handling collaborative user events. The adaptive compression of the multimedia scene graph and the lossless compression of the collaborative messages are optimized through two dedicated algorithms. The former creates a single scene graph, intrinsically adaptable to the network/terminal conditions. The latter dynamically generates and updates the encoding table according to the messages generated by the collaborating users. Both algorithms are patented. Direct collaborative functionality is ensured at the content level by enriching the scene graph with a new type of node, which is currently in the process of ISO standardization. The experimental setup considers the Linux X Window System and BiFS/LASeR multimedia scene technologies on the server and client sides, respectively. 
The implemented solution was objectively benchmarked against currently deployed solutions (VNC RFB and Microsoft RDP) on text-editing and web-browsing applications. The quantitative assessments demonstrate: (1) limited degradation of visual quality, e.g. PSNR values between 30 and 42 dB and SSIM values larger than 0.9999; (2) downlink bandwidth gain factors ranging from 2 to 60; (3) efficient real-time user event management, expressed by a reduction of the network round-trip time by factors of 4 to 6 and by uplink bandwidth gain factors from 3 to 10; (4) feasible CPU activity, larger than in the Microsoft RDP case but reduced by a factor of 1.5 with respect to VNC RFB.
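For readers unfamiliar with the PSNR figures quoted in this abstract, the standard definition can be sketched as follows. This is a generic illustration over flat lists of 8-bit sample values, not the evaluation code used in the thesis; the function name and the `peak` default are assumptions.

```python
import math


def psnr(reference, distorted, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals: PSNR is unbounded
    return 10 * math.log10(peak ** 2 / mse)


# Hypothetical 8-bit samples: each distorted value is off by one grey level.
value_db = psnr([100, 100, 100, 100], [101, 99, 101, 99])
```

With 8-bit samples, a mean squared error of 1 corresponds to roughly 48 dB, so values in the 30 to 42 dB range indicate visible but moderate conversion artifacts.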
26

MPEG-4 AVC stream watermarking / Tatouage du flux compressé MPEG-4 AVC

Hasnaoui, Marwen 28 March 2014
La présente thèse aborde le sujet de tatouage du flux MPEG-4 AVC sur ses deux volets théoriques et applicatifs en considérant deux domaines applicatifs à savoir la protection du droit d’auteur et la vérification de l'intégrité du contenu. Du point de vue théorique, le principal enjeu est de développer un cadre de tatouage unitaire en mesure de servir les deux applications mentionnées ci-dessus. Du point de vue méthodologique, le défi consiste à instancier ce cadre théorique pour servir les applications visées. La première contribution principale consiste à définir un cadre théorique pour le tatouage multi symboles à base de modulation d’index de quantification (m-QIM). La règle d’insertion QIM a été généralisée du cas binaire au cas multi-symboles et la règle de détection optimale (minimisant la probabilité d’erreur à la détection en condition du bruit blanc, additif et gaussien) a été établie. Il est ainsi démontré que la quantité d’information insérée peut être augmentée par un facteur de log2m tout en gardant les mêmes contraintes de robustesse et de transparence. Une quantité d’information de 150 bits par minutes, soit environ 20 fois plus grande que la limite imposée par la norme DCI est obtenue. La deuxième contribution consiste à spécifier une opération de prétraitement qui permet d’éliminer les impactes du phénomène du drift (propagation de la distorsion) dans le flux compressé MPEG-4 AVC. D’abord, le problème a été formalisé algébriquement en se basant sur les expressions analytiques des opérations d’encodage. Ensuite, le problème a été résolu sous la contrainte de prévention du drift. 
Une amélioration de la transparence avec des gains de 2 dB en PSNR est obtenue / The present thesis addresses MPEG-4 AVC stream watermarking and considers two theoretical and applicative challenges, namely ownership protection and content integrity verification. From the theoretical point of view, the main challenge is to develop a unitary watermarking framework (insertion/detection) able to serve the two above-mentioned applications in the compressed domain. From the methodological point of view, the challenge is to instantiate this theoretical framework to serve the targeted applications. The first main contribution consists in building the theoretical framework for multi-symbol watermarking based on quantization index modulation (m-QIM). The insertion rule is analytically designed by extending the binary QIM rule. The detection rule is optimized so as to ensure minimal probability of error under additive white Gaussian noise attacks. It is thus demonstrated that the data payload can be increased by a factor of log2(m) for prescribed transparency and additive Gaussian noise power. A data payload of 150 bits per minute, i.e. about 20 times larger than the limit imposed by the DCI standard, is obtained. The second main theoretical contribution consists in specifying a preprocessing MPEG-4 AVC shaping operation that can eliminate the intra-frame drift effect. Drift is the spread of distortion in the compressed stream, related to the MPEG encoding paradigm. In this respect, the drift distortion propagation problem in MPEG-4 AVC is expressed algebraically and the corresponding system of equations is solved under drift-free constraints. The drift-free shaping results in a transparency gain of 2 dB in PSNR.
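The idea of extending binary QIM to m symbols can be sketched with a textbook scalar quantizer. This is a generic illustration and not the insertion or detection rule derived in the thesis: it assumes m uniformly shifted lattices with step `delta` and minimum-distance detection, and the function names and numeric values are hypothetical.

```python
def qim_embed(x, symbol, m, delta):
    """Quantize x onto the lattice shifted by symbol * delta / m."""
    offset = symbol * delta / m
    return delta * round((x - offset) / delta) + offset


def qim_detect(y, m, delta):
    """Return the symbol whose shifted lattice lies closest to y."""
    def distance(symbol):
        offset = symbol * delta / m
        r = (y - offset) / delta
        return abs(r - round(r))
    return min(range(m), key=distance)


# Hypothetical host coefficient, alphabet size and quantization step.
m, delta, host = 4, 1.0, 3.7
noise = 0.05  # smaller than delta / (2 * m), so detection should succeed
decoded = [qim_detect(qim_embed(host, s, m, delta) + noise, m, delta)
           for s in range(m)]
```

Each marked coefficient carries one of m symbols, i.e. log2(m) bits instead of one, which is where the payload factor quoted in the abstract comes from; the price in this simple sketch is that the lattices sit closer together, so the tolerated noise shrinks to `delta / (2 * m)`.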
27

Enriched in-band video : from theoretical modeling to new services for the society of knowledge / In-band enriched video : de la modélisation théorique aux nouveaux services pour la société des connaissances

Belhaj Abdallah, Maher 05 December 2011
Cette thèse a pour ambition d’explorer d’un point de vue théorique et applicatif le paradigme de l’in-band enrichment. Emergence de la société des connaissances, le concept de média enrichi renvoie à toute association de métadonnée (textuelle, audiovisuelle, code exécutable) avec un média d’origine. Un tel principe peut être déployé dans une large variété d’applications comme la TVNi - Télévision Numérique interactive, les jeux ou la fouille des données. Le concept de l’inband enrichement conçu et développé par M. Mitrea et son équipe au Département ARTEMIS de Télécom SudParis, suppose que les données d’enrichissement sont insérées dans le contenu même à enrichir. Ainsi, un tel concept peut-il tirer parti de techniques de tatouage, dès lors que celles-ci démontrent qu’elles ont la capacité d’insérer la quantité d’information requise par ce nouveau type d’application : i.e. 10 à 1000 fois plus grande que celle nécessaire pour les enjeux d’authentification ou de protection de droit d’auteur. Si par tradition la marque est insérée dans le domaine non compressé, les contraintes relatives aux nombreuses applications émergentes (comme la VoD – Vidéo à la Demande ou la TVNi) font du tatouage en temps réel dans le domaine compressé un important défi théorique et applicatif. Cependant, le tatouage dans le domaine compressé est une alliance de mots contradictoires puisque la compression (élimination de la redondance) rend l’hôte plus sensible aux modifications et l’association hôte/marque, plus fragile / The present thesis, developed at Institut Télécom Télécom SudParis under the “Futur et Rupture” framework, takes the challenge of exploring from both theoretical and applicative points of views the in band enrichment paradigm. Emerged with the knowledge society, the enriched media refers to any type of association which may be established between some metadata (textual, audio, video, exe codes...) and a given original media. 
Such a concept is currently deployed in a large variety of applications, such as iDTV (interactive Digital TV), games and data mining. The notion of in-band enrichment advanced at the ARTEMIS Department assumes that the enrichment data are inserted directly into the original media to be enriched. In practice, in-band enrichment can be supported by watermarking technologies, assuming they afford a very large data payload, i.e. 10 to 1000 times larger than that of traditional copyright applications. The current advent of ubiquitous media computing and storage applications imposes an additional constraint on watermarking techniques: the enrichment data should be inserted into compressed original media. A priori, such a requirement is a contradiction in terms, as compression eliminates the visual redundancy while watermarking exploits that redundancy in order to insert the mark imperceptibly.
28

Efficient compression of synthetic video

Mazhar, Ahmad Abdel Jabbar Ahmad January 2013 (has links)
Streaming of on-line gaming video is a challenging problem because of the enormous amounts of video data that need to be sent during game playing, especially within the limitations of uplink capabilities. The encoding complexity is also a challenge because of the time delay while on-line gamers are communicating. The main goal of this research study is to propose an enhanced on-line game video streaming system. First, the most common video coding techniques have been evaluated. The evaluation study considers objective and subjective metrics. Three widespread video coding techniques are selected and evaluated in the study: H.264, MPEG-4 Visual and VP8. Diverse types of video sequences were used with different frame rates and resolutions. The effects of changing frame rate and resolution on compression efficiency and viewers' satisfaction are also presented. Results showed that the compression process and perceptual satisfaction are severely affected by the nature of the compressed sequence. Overall, H.264 showed higher compression efficiency for synthetic sequences and outperformed the other codecs in the subjective evaluation tests. Second, a fast inter prediction technique to speed up the encoding process of H.264 has been devised. The on-line game streaming service is a real-time application; thus, compression complexity significantly affects the whole process of on-line streaming. H.264 is recommended for synthetic video coding by the results of our comparative codec studies. However, it still suffers from high encoding complexity; thus a low-complexity coding algorithm is presented as a fast inter coding model with a reference management technique. The proposed algorithm was compared to a state-of-the-art method, and the results show better time and bit rate reduction with negligible loss of fidelity. Third, recommendations on the tradeoff between frame rate and resolution within given uplink capabilities are provided for H.264 video coding. The recommended tradeoffs result from extensive experiments using the Double Stimulus Impairment Scale (DSIS) subjective evaluation metric. Experiments showed that viewers' satisfaction is profoundly affected by varying frame rates and resolutions. In addition, increasing frame rate or frame resolution does not always guarantee improved perceptual quality. As a result, tradeoffs between frame rate and resolution are recommended within a given bit rate to guarantee the highest user satisfaction. For system completeness and to facilitate the implementation of the proposed techniques, an efficient game video streaming management system is proposed. Compared to existing on-line live video service systems for games, the proposed system provides improved coding efficiency, complexity reduction and better user satisfaction.
29

Υλοποίηση του MPEG-4 Simple Profile CODEC στην πλατφόρμα TMS320DM6437 για επεξεργασία βίντεο σε πραγματικό χρόνο / Implementation of MPEG-4 Simple Profile CODEC in DSP platform TMS320DM6437 for video processing in real-time

Σωτηρόπουλος, Κωνσταντίνος 30 April 2014
Η παρούσα ειδική ερευνητική εργασία εκπονήθηκε στα πλαίσια του Διατμηματικού Προγράμματος Μεταπτυχιακών Σπουδών Ειδίκευσης στα “Συστήματα Επεξεργασίας Σημάτων και Επικοινωνιών” στο Τμήμα Φυσικής του Πανεπιστημίου Πατρών. Αντικείμενο της παρούσας εργασίας είναι η σχεδίαση και ανάπτυξη του MPEG – 4 Simple Profile CODEC στο περιβάλλον Simulink με σκοπό την τελική εκτέλεση του αλγορίθμου DSP που θα προκύψει, στην πλατφόρμα ανάπτυξης TMS320DM6437 EVM. Στο πρώτο κεφάλαιο ορίζεται η έννοια της κωδικοποίησης βίντεο σε πραγματικό χρόνο και περιγράφεται η σύγχυση που επικρατεί γύρω από αυτήν. Επίσης γίνεται μια περιγραφή των επεξεργαστών ψηφιακού σήματος ως προς τα τυπικά χαρακτηριστικά που διαθέτουν, την αρχιτεκτονική τους, την αρχιτεκτονική μνήμης, τα στοιχεία υλικού που διαθέτουν για τη ροή του DSP προγράμματος, ενώ παράλληλα, παρουσιάζεται η ιστορική εξέλιξη των DSPs που οδήγησε στους σύγχρονους DSPs και οι οποίοι, διαθέτουν καλύτερες επιδόσεις από τους προπάτορές τους, και αυτό χάρη στις τεχνολογικές και αρχιτεκτονικές εξελίξεις όπως, οι χαμηλότεροι κανόνες σχεδίασης, η γρήγορη προσπέλαση κρυφής μνήμης δύο επιπέδων, η σχεδίαση του DMA και ενός μεγαλύτερου συστήματος διαύλου. Στο τέλος του κεφαλαίου παρουσιάζεται η αρχιτεκτονική της πλατφόρμας ανάπτυξης TMS320DM6437 EVM καθώς και οι διεπαφές υλικού που διαθέτει για την είσοδο και έξοδο βίντεο/ήχου από αυτήν. Στο δεύτερο κεφάλαιο γίνεται μια εκτενής παρουσίαση των εννοιών που συναντώνται στην κωδικοποίηση βίντεο. Στην αρχή του κεφαλαίου απεικονίζεται το γενικό μοντέλο ενός κωδικοποιητή/αποκωδικοποιητή και βάσει αυτού προχωράμε στην περιγραφή του χρονικού μοντέλου, το οποίο επιβάλλει την πρόβλεψη του τρέχοντος πλαισίου βίντεο χρησιμοποιώντας το προηγούμενο, ενώ παράλληλα, εξηγεί και μεθόδους για την εκτίμηση κίνησης περιοχών (μακρομπλοκ) μέσα στο πλαίσιο ενός βίντεο και το πώς μπορεί να γίνει ο υπολογισμός του σφάλματος κίνησης τους. 
Στη συνέχεια περιγράφεται το μοντέλο εικόνας το οποίο στην πράξη αποτελείται από τρία συστατικά μέρη: τον μετασχηματισμό (αποσυσχετίζει και συμπιέζει τα δεδομένα), την κβάντιση (μειώνει την ακρίβεια των μετασχηματισμένων δεδομένων) και την ανακατάταξη (ανακατατάσσει τα δεδομένα ούτως ώστε να ομαδοποιήσει μαζί τις σημαντικές τιμές). Οι συντελεστές του μετασχηματισμού μετά την ανακατάταξη και την κωδικοποίηση, μπορούν να κωδικοποιηθούν περαιτέρω με τη χρήση κωδικών μεταβλητού μήκους (Huffman κωδικοποίηση) ή μέσω αριθμητικής κωδικοποίησης. Στο τέλος του κεφαλαίου περιγράφεται το υβριδικό μοντέλο DPCM/DCT CODEC πάνω στον οποίο στηρίζεται και η υλοποίηση του MPEG – 4 Simple Profile CODEC. Στο τρίτο κεφάλαιο ουσιαστικά γίνεται μια περιγραφή των χαρακτηριστικών του MPEG – 4 Simple Profile CODEC, των εργαλείων που χρησιμοποιεί, της έννοιας αντικείμενο που πλέον υπεισέρχεται στην κωδικοποίηση βίντεο καθώς και τα είδη προφίλ και επιπέδων που υποστηρίζει το συγκεκριμένο πρωτόκολλο κωδικοποίησης/αποκωδικοποίησης. Στο τέταρτο κεφάλαιο παρουσιάζεται η υλοποίηση του κωδικοποιητή, του αποκωδικοποιητή του MPEG – 4 Simple Profile CODEC καθώς και των επιμέρους υποσυστημάτων που τους απαρτίζουν. Στο πέμπτο κεφάλαιο περιγράφεται η αλληλεπίδραση του χρήστη με το σύστημα κωδικοποίησης/αποκωδικοποίησης, τι παράμετροι χρειάζονται να δοθούν ως είσοδοι από αυτόν, καθώς και πως είναι δυνατή η χρήση του συγκεκριμένου συστήματος. / The objective of this project is the design and development of an MPEG-4 Simple Profile CODEC in the Simulink environment, in order to execute the resulting DSP algorithm on the TMS320DM6437 EVM development platform. The first chapter defines real-time video coding and describes the confusion that surrounds the term. It also gives a brief description of DSP systems, covering their typical characteristics, their architecture, their memory architecture and the hardware elements they provide to support the flow of a DSP program. 
The evolution of DSPs over time is also presented; it led to modern DSPs that perform better than their predecessors thanks to technological and architectural improvements such as smaller design rules, a fast-access two-level cache, (E)DMA circuitry and a wider bus system. The chapter ends with the architecture of the TMS320DM6437 EVM board and its hardware interfaces for video and audio input/output. The second chapter gives an extensive presentation of the concepts encountered in video coding. It begins with a general model of a video encoder/decoder, which leads to the description of the temporal model: the current frame is predicted from the previous one, and the methods for macroblock motion estimation and motion compensation are explained. Next, the image model is described, which consists of three components: the transform (decorrelates and compacts the data), quantization (reduces the precision of the transformed data) and reordering (arranges the data so that significant values are grouped together). After reordering, the transform coefficients can be coded further using variable-length codes (Huffman coding) or arithmetic coding. The chapter ends with the hybrid DPCM/DCT CODEC model, on which the implementation of the MPEG-4 Simple Profile CODEC is based. The third chapter describes the characteristics of the MPEG-4 Simple Profile CODEC, the tools it uses, the notion of an "object" that now enters video coding, and the profiles and levels supported by this coding/decoding standard. 
Finally, it describes how rectangular frames are coded and presents the Simulink model of the MPEG-4 Simple Profile CODEC, which is the basis for the DSP algorithm executed on the development platform. The fourth chapter presents the implementation of the encoder and the decoder of the MPEG-4 Simple Profile CODEC and of the subsystems that compose them. The fifth chapter describes how the user interacts with the CODEC, which parameters must be supplied as inputs, and how the system can be used.
30

Approches orientées modèle pour la capture des mouvements du visage en vision par ordinateur / Model-based approaches for facial motion capture in computer vision

Malciu, Marius 01 December 2001
3D object model, monoscopic video sequences, 3D pose estimation, 3D/2D registration, texture, optical flow, large-amplitude translation and rotation, occlusion, block matching, temporal interpolation, wave-based modeling, visibility criterion, facial deformation analysis, MPEG-4 face description, deformable prototype, mouth, eyes, B-splines, unsupervised fuzzy classification, simplex method, facial deformation synthesis.
