311 |
Facial Features Tracking using Active Appearance Models. Fanelli, Gabriele, January 2006
This thesis aims at building a system capable of automatically extracting and parameterizing the position of a face and its features in images acquired from a low-end monocular camera. Such a challenging task is justified by the importance and variety of its possible applications, ranging from face and expression recognition to animation of virtual characters using video depicting real actors. The implementation includes the construction of Active Appearance Models of the human face from training images. The existing face model Candide-3 is used as a starting point, making the translation of the tracking parameters to standard MPEG-4 Facial Animation Parameters easy. The Inverse Compositional Algorithm is employed to adapt the models to new images, working on a subspace where the appearance is "projected out" and thus focusing only on shape. The algorithm is tested on a generic model, aiming at tracking different people’s faces, and on a specific model, considering one person only. In the former case, the need for improvements in the robustness of the system is highlighted. By contrast, the latter case gives good results regarding both quality and speed, with real-time performance being a feasible goal for future developments.
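For readers unfamiliar with the "project-out" inverse compositional step mentioned above, the following Python sketch (not from the thesis) shows the core fitting loop. The warp function `image_warp`, the steepest-descent images `sd_images` and the appearance basis are assumed inputs, and the simple additive parameter update stands in for the true compositional warp update.

```python
import numpy as np

def project_out(sd_images, appearance_basis):
    """Remove the appearance subspace from the steepest-descent images,
    so the fit only solves for shape (the 'project-out' trick)."""
    # sd_images: (n_pixels, n_params); appearance_basis: (n_pixels, n_modes), orthonormal
    return sd_images - appearance_basis @ (appearance_basis.T @ sd_images)

def fit_aam(image_warp, mean_appearance, appearance_basis, sd_images, p0,
            n_iters=30, tol=1e-6):
    """Minimal project-out inverse-compositional fitting loop.
    `image_warp(p)` is assumed to sample the input image onto the model's
    mean shape for shape parameters p and return a flattened pixel vector."""
    sd_po = project_out(sd_images, appearance_basis)   # precomputed once
    hessian = sd_po.T @ sd_po                          # constant Hessian
    h_inv = np.linalg.inv(hessian)
    p = p0.copy()
    for _ in range(n_iters):
        error = image_warp(p) - mean_appearance        # error image in the model frame
        dp = h_inv @ (sd_po.T @ error)
        p -= dp                                        # additive stand-in for the
                                                       # compositional warp update
        if np.linalg.norm(dp) < tol:
            break
    return p
```

Because the Hessian and projected steepest-descent images are computed once, each iteration is cheap, which is what makes real-time fitting a plausible target.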
|
312 |
Object Segmentation, Tracking And Skeletonization In MPEG Video. Padmashree, P, 07 1900
No description available.
|
313 |
Grafické zobrazení relací mezi počítači v Internetu / Visualization of relations between computers in the Internet. Cimbálek, Přemysl, January 2008
Internet Protocol Television (IPTV) transmits the television signal over the TCP/IP family of protocols. Its advantages include the fact that transmission is not only one-way, as in "classical" TV broadcasting, but can also provide feedback and interactivity. There are also problems that hinder its deployment, such as the low channel capacity of access networks, which is why new methods are being proposed, for example to make IPTV transmission more efficient. The main task of this diploma thesis is to visualize the tree structure of relations between nodes in the network, based on an understanding of the principles of hierarchical summarization and IPTV transmission. The nodes in the tree structure compute and summarize the data carried in the feedback (back-way) channel, which conveys data from the end users. The first part of the thesis explains the principle of IPTV and how it differs from classical TV broadcasting, as well as the supported services, advantages and disadvantages. It covers data compression with the MPEG-2 and MPEG-4 standards and the transport-network issue known as the "last mile problem." For transmitting data, IPTV uses Source Specific Multicast: every user joins the multicast session carrying the requested TV program, while feedback is provided by unicast. The feedback network uses the hierarchical summarization principle to reduce the amount of data; this problem, connected with the RTP, RTCP and TTP protocols, is described in the work as well. The theoretical part also introduces PlanetLab, an international experimental network in which the proposed protocol structure and applications, including the visualization for IPTV broadcasting, are tested. The practical part discusses possible methods for visualization and data storage. Because of their high availability and flexibility, web technologies were chosen, with MySQL for data storage. The tree model is implemented in Java. The visualization itself is handled by web technologies: the source code for the visualization is generated dynamically by scripts in JSP (Java Server Pages), and the graphical output uses the vector format SVG (Scalable Vector Graphics), which was designed for graphics on the web and on mobile phones. Thanks to its ability to cooperate with JavaScript, an interactive web application was created that visualizes the relation-tree structure of nodes. The work explains the basics of all the technologies used, justifies the chosen methods and formats, and presents examples and interesting parts of the solution.
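As a rough illustration of the hierarchical summarization idea described above (not taken from the thesis, whose tree model is written in Java), the Python sketch below lets each node of the feedback tree merge its children's reports into one compact summary before forwarding it upstream; the report fields `viewers` and `loss` are assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """One node of the feedback tree: it merges the reports coming up
    from its children before forwarding a single summary to its parent."""
    name: str
    children: list = field(default_factory=list)
    local_report: dict = field(default_factory=dict)   # e.g. {"viewers": 3, "loss": 0.01}

    def summarize(self) -> dict:
        reports = [c.summarize() for c in self.children] + [self.local_report]
        viewers = sum(r.get("viewers", 0) for r in reports)
        # weight packet-loss figures by the number of viewers they describe
        loss = (sum(r.get("viewers", 0) * r.get("loss", 0.0) for r in reports) / viewers
                if viewers else 0.0)
        return {"node": self.name, "viewers": viewers, "loss": round(loss, 4)}

# toy tree: two access nodes feeding one regional aggregation node
leaf_a = Node("dsl-1", local_report={"viewers": 120, "loss": 0.02})
leaf_b = Node("dsl-2", local_report={"viewers": 80, "loss": 0.005})
root = Node("region-1", children=[leaf_a, leaf_b])
print(root.summarize())   # one compact record instead of per-user feedback
```

The point of the aggregation is that the back-way channel carries one summary per subtree rather than one report per viewer, which is what keeps the feedback traffic manageable.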
|
314 |
Percepční kódování zvukových signálů / Perceptual Audio Coding. Novák, Vladimír, January 2011
This thesis describes perceptual audio coding in the MPEG-1 Layer 3 format (ISO/IEC 11172-3) and the principles and algorithms of its psychoacoustic model. A MATLAB application modeling Psychoacoustic Model 2 of this audio format is developed.
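As a loose illustration of the kind of computation Psychoacoustic Model 2 performs (this is not the thesis's MATLAB code and is heavily simplified), the sketch below derives a crude per-band signal-to-mask ratio from one audio frame; the uniform band split, the neighbour "spreading" rule and the fixed 15 dB masking offset are all assumptions standing in for the Bark-scaled partitions, spreading function and tonality estimation of the real model.

```python
import numpy as np

def simple_smr(frame, n_bands=32, masking_offset_db=15.0):
    """Very rough signal-to-mask ratio per band for one audio frame."""
    windowed = frame * np.hanning(len(frame))
    power = np.abs(np.fft.rfft(windowed)) ** 2 + 1e-12
    bands = np.array_split(power, n_bands)            # uniform bands, not Bark bands
    energy_db = np.array([10 * np.log10(b.sum()) for b in bands])
    # crude "spreading": a band is also masked by its immediate neighbours
    # (np.roll wraps at the edges; good enough for a sketch)
    spread = np.maximum(energy_db,
                        np.maximum(np.roll(energy_db, 1) - 10,
                                   np.roll(energy_db, -1) - 10))
    threshold_db = spread - masking_offset_db          # crude masking threshold
    return energy_db - threshold_db                    # SMR the bit-allocation loop uses

frame = np.sin(2 * np.pi * 1000 * np.arange(1152) / 44100)   # one 1152-sample MP3 frame
print(simple_smr(frame)[:4])
```

In the real coder the resulting SMR values drive the bit-allocation and quantization loops, so quantization noise is spent where the masking threshold hides it.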
|
315 |
Workshop Mensch-Computer-Vernetzung / Workshop on Human-Computer Networking. Hübner, Uwe, 15 October 2003
Workshop on human-computer networking, held 14-17 April 2003 in Löbsal (near Meißen).
|
316 |
Construction et Présentation des Vidéos Interactives / Construction and Presentation of Interactive Videos. Hammoud, Riad, 27 February 2001
The arrival of the MPEG-7 standard for video requires the creation of high-level structures representing video content. This thesis addresses the automation of building part of these structures. As a starting point, we use tools for segmenting moving objects. Our goals are then to find similar objects within the video and to use the similarities between camera shots to group shots into scenes. Once these structures are built, it is easy to provide end users with video browsing tools that allow interactive navigation, for example jumping to the next shot or scene containing a given character. The main difficulty lies in the great variability of the observed objects: changes of viewpoint and scale, occlusions, and so on. The main contribution of this thesis is the modelling of the variability of the observations by a mixture of densities based on Gaussian mixture theory. This model captures the different intra-shot appearances of the tracked object and considerably reduces the number of low-level descriptors that must be indexed per tracked object. Around this contribution are grafted several proposals that can be seen as applications of it: matching tracked objects represented by Gaussian mixtures, building initial categories of all the objects present in a video with an unsupervised classification technique, extracting characteristic views, and using the detection of similar objects to group shots into scenes.
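A minimal sketch of the Gaussian-mixture idea described above (not the thesis's implementation): the per-frame low-level descriptors of one tracked object are summarized by a small mixture model that can later score new observations for matching. The descriptor dimensionality, the random placeholder data and the component count are assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Hypothetical low-level descriptors (e.g. colour histograms) extracted for one
# tracked object across the frames of a shot: one row per frame.
descriptors = rng.normal(size=(300, 16))

# Summarise the shot with a small Gaussian mixture instead of indexing
# every per-frame descriptor; the number of components is a free choice here.
gmm = GaussianMixture(n_components=3, covariance_type="diag", random_state=0)
gmm.fit(descriptors)

# A new observation of (possibly) the same object can then be scored against
# the compact model, which is the basis for matching objects across shots.
query = rng.normal(size=(1, 16))
print("average log-likelihood under the shot model:", gmm.score(query))
```

Indexing a handful of mixture components per tracked object, rather than hundreds of per-frame descriptors, is what makes shot-to-shot matching and the subsequent scene grouping tractable.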
|
317 |
Video transcoding using machine learning. Unknown Date
The field of video transcoding has been evolving throughout the past ten years. The need for transcoding of video files has greatly increased because new standards are incompatible with old ones. This thesis takes the approach of using machine learning for video transcoding mode decisions and discusses ways to improve the process of generating the learned algorithm for implementation in different video transcoders. The transcoding methods used decrease the complexity of the mode decision inside the video encoder. Methods that automate the process and improve the results are also discussed and implemented in two different transcoders: H.263 to VP6, and MPEG-2 to H.264. Both of these transcoders have shown a complexity reduction of almost 50%. Video transcoding is important because the number of video standards keeps increasing while devices can usually decode only one specific codec. / by Christopher Holder. Thesis (M.S.C.S.)--Florida Atlantic University, 2008. Includes bibliography.
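A minimal sketch of the machine-learning mode-decision idea (not the thesis's code): a shallow decision tree is trained to predict the outgoing encoder's macroblock mode from features already available in the incoming, decoded stream, so the transcoder can skip the exhaustive rate-distortion search. The feature set, the random placeholder data, the mode labels and the tree depth are all assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
# Hypothetical per-macroblock features pulled from the incoming (already
# decoded) MPEG-2 stream: residual energy, motion-vector magnitude,
# coded-block count, mean luma activity.
X = rng.random((5000, 4))
# Hypothetical labels: the mode a full-search H.264 encoder would pick
# (0 = SKIP, 1 = inter 16x16, 2 = inter 8x8, 3 = intra).
y = rng.integers(0, 4, size=5000)

# A shallow tree keeps the learned decision cheap enough to evaluate per
# macroblock inside the transcoder's mode-decision stage.
clf = DecisionTreeClassifier(max_depth=4).fit(X, y)

incoming_mb = rng.random((1, 4))
print("predicted mode, skipping the exhaustive RD search:", clf.predict(incoming_mb)[0])
```

Replacing the full mode search with a few feature comparisons is where the roughly 50% complexity reduction reported above would come from, at the cost of occasional sub-optimal mode choices.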
|
318 |
Virtual human modelling and animation for real-time sign language visualisation. Van Wyk, Desmond Eustin, January 2008
Magister Scientiae - MSc / This thesis investigates the modelling and animation of virtual humans for real-time sign language visualisation. Sign languages are fully developed natural languages used by Deaf communities all over the world. These languages are communicated in a visual-gestural modality by the use of manual and non-manual gestures and are completely different from spoken languages. Manual gestures include the use of hand shapes, hand movements, hand locations and orientations of the palm in space. Non-manual gestures include the use of facial expressions, eye-gazes, head and upper body movements. Both manual and non-manual gestures must be performed for sign languages to be correctly understood and interpreted. To effectively visualise sign languages, a virtual human system must have models of adequate quality and be able to perform both manual and non-manual gesture animations in real-time. Our goal was to develop a methodology and establish an open framework by using various standards and open technologies to model and animate virtual humans of adequate quality to effectively visualise sign languages. This open framework is to be used in a Machine Translation system that translates from a verbal language such as English to any sign language. Standards and technologies we employed include H-Anim, MakeHuman, Blender, Python and SignWriting. We found it necessary to adapt and extend H-Anim to effectively visualise sign languages. The adaptations and extensions we made to H-Anim include imposing joint rotational limits, developing flexible hands and the addition of facial bones based on the MPEG-4 Facial Definition Parameters facial feature points for facial animation. By using these standards and technologies, we found that we could circumvent a few difficult problems, such as: modelling high quality virtual humans; adapting and extending H-Anim; creating a sign language animation action vocabulary; blending between animations in an action vocabulary; sharing animation action data between our virtual humans; and effectively visualising South African Sign Language. / South Africa
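A minimal sketch of one of the H-Anim adaptations mentioned above, imposing joint rotational limits (this is not the thesis's Blender/Python code; the joint names and limit values are illustrative assumptions):

```python
from dataclasses import dataclass

@dataclass
class JointLimit:
    """Per-axis Euler rotation limits (degrees) for one H-Anim joint.
    The values below are illustrative assumptions, not H-Anim mandated ones."""
    min_xyz: tuple
    max_xyz: tuple

LIMITS = {
    "l_elbow":   JointLimit((0, -90, 0),     (150, 90, 0)),
    "l_wrist":   JointLimit((-80, -20, -30), (80, 20, 30)),
    "skullbase": JointLimit((-40, -60, -30), (40, 60, 30)),
}

def clamp_rotation(joint: str, xyz_degrees: tuple) -> tuple:
    """Clamp an animation key so a sign cannot bend a joint beyond its limit."""
    lim = LIMITS[joint]
    return tuple(max(lo, min(hi, v))
                 for v, lo, hi in zip(xyz_degrees, lim.min_xyz, lim.max_xyz))

# an animation curve asking the elbow to hyper-extend gets pulled back in range
print(clamp_rotation("l_elbow", (170.0, -10.0, 5.0)))   # elbow X clamped to its 150-degree limit
```

Clamping keys against per-joint limits keeps blended or shared animation actions from driving the skeleton into anatomically impossible poses, which matters when the same action vocabulary is reused across different virtual humans.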
|
319 |
Error resilience for video coding services over packet-based networks. Zhang, Jian, Electrical Engineering, Australian Defence Force Academy, UNSW, January 1999
Error resilience is an important issue when coded video data is transmitted over wired and wireless networks. Errors can be introduced by network congestion, mis-routing and channel noise. These transmission errors can result in bit errors being introduced into the transmitted data or packets of data being completely lost. Consequently, the quality of the decoded video is degraded significantly. This thesis describes new techniques for minimising this degradation. To verify video error resilience tools, it is first necessary to consider the methods used to carry out experimental measurements. For most audio-visual services, streams of both audio and video data need to be simultaneously transmitted on a single channel. The inclusion of the impact of multiplexing schemes, such as MPEG-2 Systems, in error resilience studies is also an important consideration. It is shown that error resilience measurements including the effect of the Systems Layer differ significantly from those based only on the Video Layer. Two major issues of error resilience are investigated within this thesis: resynchronisation after error detection, and error concealment. Results for resynchronisation using small slices, adaptive slice sizes and macroblock resynchronisation schemes are provided. These measurements show that the macroblock resynchronisation scheme achieves the best performance, although it is not included in the MPEG-2 standard. The performance of the adaptive slice size scheme, however, is similar to that of the macroblock resynchronisation scheme, and this approach is compatible with the MPEG-2 standard. The most important contribution of this thesis is a new concealment technique, namely Decoder Motion Vector Estimation (DMVE). The decoded video quality can be improved significantly with this technique. Basically, this technique utilises the temporal redundancy between the current and the previous frames, and the correlation between lost macroblocks and their surrounding pixels. Therefore, motion estimation can be applied again to search in the previous picture for a match to those lost macroblocks. The process is similar to the one the encoder performs, but it takes place in the decoder. The integration of DMVE with small slices, adaptive slice sizes or macroblock resynchronisation is also evaluated, providing an overview of the performance produced by individual techniques compared to the combined techniques. Results show that high performance can be achieved by integrating DMVE with an effective resynchronisation scheme, even at high cell loss rates. The results of this thesis demonstrate clearly that the MPEG-2 standard is capable of providing a high level of error resilience, even in the presence of high loss. The key to this performance is appropriate tuning of encoders and effective concealment in decoders.
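A minimal sketch of the DMVE idea (not the thesis's implementation): the decoder re-runs a small motion search for a lost macroblock, matching only the correctly received ring of pixels around the hole against the previous frame, then copies the best-matching block. The block size, search range and border width are assumptions, and the caller is assumed to keep the search window inside the frame.

```python
import numpy as np

def dmve_conceal(cur, prev, x, y, mb=16, search=8, border=2):
    """Conceal a lost macroblock at (x, y) in `cur` using the previous frame.
    `cur` and `prev` are 2-D luma arrays; the window must stay inside the frame."""
    y0, y1 = y - border, y + mb + border
    x0, x1 = x - border, x + mb + border
    ring = cur[y0:y1, x0:x1].astype(np.float32)
    ring[border:border + mb, border:border + mb] = np.nan    # the lost pixels
    mask = ~np.isnan(ring)                                   # received border ring only

    best, best_cost = (0, 0), np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            cand = prev[y0 + dy:y1 + dy, x0 + dx:x1 + dx].astype(np.float32)
            cost = np.abs(cand[mask] - ring[mask]).sum()     # SAD on the ring only
            if cost < best_cost:
                best, best_cost = (dy, dx), cost

    dy, dx = best
    # copy the winning block from the previous frame into the hole
    cur[y:y + mb, x:x + mb] = prev[y + dy:y + dy + mb, x + dx:x + dx + mb]
    return best   # the motion vector the decoder estimated for the lost block
```

This mirrors the abstract's description: the search exploits temporal redundancy with the previous frame and the spatial correlation between the lost macroblock and its surrounding pixels, but runs entirely in the decoder.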
|
320 |
Formalisation des connaissances documentaires et des connaissances conceptuelles à l'aide d'ontologies : application à la description de documents audiovisuels / Formalising documentary and conceptual knowledge using ontologies: application to the description of audiovisual documents. Troncy, Raphaël, 5 March 2004
The temporal nature of audiovisual material means that documents must be enriched through description in order to be exploited. We argue that a representation of both the structure and the content of documents is necessary. By structure we mean the documentary structure, that is, the mereological organization of the elements that make up the document, whereas the content is a conceptual structure, that is, a categorization of these elements. After a review of current proposals for modelling audiovisual documents, coming from document engineering and knowledge engineering, we show that none of the languages studied handles these two aspects satisfactorily. We therefore propose a general architecture allowing the formal representation of the structure and content of audiovisual documents, which yields a knowledge base over which reasoning can be performed. This architecture consists of an ontology of the audiovisual domain, part of which is translated into a document language to control the logical structure of documents, and a domain ontology to formally describe their content. We have developed the DOE tool (Differential Ontology Editor), which implements the ontology-building methodology used. Finally, we demonstrate the relevance of the approach with two experiments using an annotated video corpus, illustrating the types of inference that become possible.
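A minimal sketch of the kind of knowledge base such an architecture could produce (not from the thesis; the namespaces, class names and the toy sports domain are invented for illustration), using rdflib to assert documentary structure and conceptual content and then query across both:

```python
from rdflib import Graph, Namespace, RDF, RDFS

AV = Namespace("http://example.org/av#")      # stand-in for the audiovisual ontology
DOM = Namespace("http://example.org/sport#")  # stand-in for a domain ontology
EX = Namespace("http://example.org/doc/")

g = Graph()
g.bind("av", AV)
g.bind("dom", DOM)

# documentary (mereological) structure: a sequence that is part of a programme
g.add((EX.prog1, RDF.type, AV.AudiovisualDocument))
g.add((EX.seq3, RDF.type, AV.Sequence))
g.add((EX.seq3, AV.isPartOf, EX.prog1))

# conceptual content: what the sequence shows, typed by the domain ontology
g.add((EX.seq3, AV.depicts, DOM.Sprint))
g.add((DOM.Sprint, RDFS.subClassOf, DOM.RaceEvent))

# a question the knowledge base can now answer: which segments depict race events?
q = """SELECT ?seg WHERE {
         ?seg av:depicts ?c .
         ?c rdfs:subClassOf* dom:RaceEvent .
      }"""
for row in g.query(q, initNs={"av": AV, "dom": DOM, "rdfs": RDFS}):
    print(row.seg)   # finds seq3 through the subclass inference in the query
```

The query returns the sequence even though it was only annotated with the more specific concept, which is the sort of inference over combined documentary and conceptual descriptions that the abstract refers to.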
|