Global ETD Search

1	Novel entropy coding and its application of the compression of 3D image and video signals Amal, Mehanna January 2013 The broadcast industry is moving future digital television towards super-high-resolution TV (4K or 8K) and/or 3D TV. This will ultimately increase the demand on data rate and, subsequently, the demand for highly efficient codecs. One technology that researchers consider promising for the industry over the next few years is 3D integral image and video, because it is simple and mimics reality without requiring any viewing aid. One of the challenges of 3D integral technology is to improve the compression algorithms so that they accommodate the high resolution and exploit the characteristics of this technology. The research scope of this thesis includes designing a novel coding scheme for 3D integral image and video compression. Firstly, to address the compression of 3D integral imaging, the research proposes a novel entropy coding scheme, which will be implemented first on traditional 2D image content, in order to compare it with the common existing standards, and then applied to 3D integral image and video. This approach seeks to achieve high performance, represented by high image quality and low bit rate, together with low computational complexity. Secondly, new algorithms will be proposed in an attempt to improve the performance of the transform techniques, initially by using a new adaptive 3D-DCT algorithm and then by proposing a new hybrid 3D DWT-DCT algorithm that exploits the advantages of each technique and removes the artifacts from which each of them suffers. Finally, the proposed entropy coding will be further applied to 3D integral video, in association with another proposed algorithm that is based on calculating the motion vector on the average viewpoint for each frame. 
This approach seeks to minimize the complexity and reduce the processing time without affecting the Human Visual System (HVS) performance. A number of block matching techniques will be evaluated to identify the one best suited to the newly proposed 3D integral video algorithm. 621.36 Image processing ; Video processing
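Note on entry 1: the abstract does not say which block matching techniques are compared. As a generic illustration only, the sketch below shows exhaustive (full-search) block matching with a sum-of-absolute-differences cost, which is the usual baseline against which faster search patterns are measured. The function names, block size, and search radius are assumptions made for this example and are not taken from the thesis.

```python
def sad(cur, ref, r, c, dr, dc, n):
    """Sum of absolute differences between the n x n block of `cur` at (r, c)
    and the block of `ref` displaced by (dr, dc)."""
    return sum(
        abs(cur[r + i][c + j] - ref[r + dr + i][c + dc + j])
        for i in range(n)
        for j in range(n)
    )


def full_search(cur, ref, r, c, n=2, radius=1):
    """Try every displacement within `radius` and return (dr, dc, cost)
    for the candidate with the lowest SAD. Frames are lists of rows."""
    height, width = len(ref), len(ref[0])
    best = None
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            # Skip candidates whose block would fall outside the reference frame.
            if r + dr < 0 or c + dc < 0:
                continue
            if r + dr + n > height or c + dc + n > width:
                continue
            cost = sad(cur, ref, r, c, dr, dc, n)
            if best is None or cost < best[2]:
                best = (dr, dc, cost)
    return best


# Toy frames: the 2 x 2 pattern sits one column further left in the reference.
ref = [[0, 0, 0, 0],
       [0, 9, 8, 0],
       [0, 7, 6, 0],
       [0, 0, 0, 0]]
cur = [[0, 0, 0, 0],
       [0, 0, 9, 8],
       [0, 0, 7, 6],
       [0, 0, 0, 0]]
motion = full_search(cur, ref, 1, 2)
```

Full search costs one SAD evaluation per candidate displacement, which is why faster search patterns are normally compared against it on both accuracy and run time.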
2	Foreground detection of video through the integration of novel multiple detection algorithims Nawaz, Muhammad January 2013 The main outcome of this research is the design of a foreground detection algorithm that is more accurate and less time consuming than existing algorithms. By accuracy we mean an exact mask (one that satisfies the respective ground truth value) of the foreground object(s). Motion detection, being the first component of the foreground detection process, can be achieved via pixel based and block based methods, both of which have their own merits and disadvantages. Pixel based methods are accurate but time consuming, so they cannot be recommended for real-time applications. Block based motion estimation, on the other hand, is less accurate but consumes less time and is thus ideal for real-time applications. In the first proposed algorithm, a block based motion estimation technique is chosen for timely execution. To overcome the issue of accuracy, a morphological technique called opening-and-closing by reconstruction was adopted; it is a pixel based operation and so produces higher accuracy while requiring less execution time. The morphological operation opening-and-closing by reconstruction finds the maxima and minima inside the foreground object(s). Thus this novel simultaneous process compensates for the lower accuracy of block based motion estimation. To verify the efficiency of this algorithm, a complex video containing multiple colours and fast and slow motions at various places was selected. Based on 11 different performance measures, the proposed algorithm achieved an average accuracy more than 24.73% higher than that of four well-established algorithms. Background subtraction, being the most cited algorithm for foreground detection, faces the major problem of choosing a proper threshold value at run time. 
In the next proposed algorithm, motion, the primary component of the foreground detection process, is used to obtain an effective threshold value at run time for the background subtraction algorithm. For this purpose, the peaks and valleys of the smoothed motion histogram were analyzed; these reflect the fast and slow motion areas of the moving object(s) in the given frame, and the threshold value is generated at run time from the values of the peaks and valleys. The proposed algorithm was tested using four recommended video sequences, including indoor and outdoor shoots, and was compared with five highly ranked algorithms. Based on the values of standard performance measures, the proposed algorithm achieved, on average, more than 12.30% higher accuracy. 006.3
3	Information theoretic methods in distributed compression and visual quality assessment Soundararajan, Rajiv 11 July 2012 Distributed compression and quality assessment (QA) are essential ingredients in the design and analysis of networked signal processing systems with voluminous data. Distributed source coding techniques enable the efficient utilization of available resources and are extremely important in a multitude of data intensive applications including image and video. The quality analysis of such systems is equally important in providing benchmarks on performance, leading to improved design and control. This dissertation approaches the complementary problems of distributed compression and quality assessment using information theoretic methods. While such an approach provides intuition on designing practical coding schemes for distributed compression, it directly yields image and video QA algorithms with excellent performance that can be employed in practice. This dissertation considers the information theoretic study of sophisticated problems in distributed compression, including multiterminal multiple description coding, multiterminal source coding through relays, and joint source channel coding of correlated sources over wireless channels. Random and/or structured codes are developed and shown to be optimal or near optimal through novel bounds on performance. While lattices play an important role in designing near optimal codes for multiterminal source coding through relays and joint source channel coding over multiple access channels, time sharing random Gaussian codebooks is optimal for a wide range of system parameters in the multiterminal multiple description coding problem. The dissertation also addresses the challenging problem of reduced reference image and video QA. A family of novel reduced reference image and video QA algorithms is developed based on spatial and temporal entropic differences. 
While the QA algorithms for still images only compute spatial entropic differences, the video QA algorithms compute both spatial and temporal entropic differences and combine them in a perceptually relevant manner. These algorithms attain excellent performance in terms of correlation with human judgments of quality on large QA databases. The framework developed also enables the study of the degradation in performance of QA algorithms from full reference information to almost no information from the reference image or video. / text Information theory ; Image and video processing ; Distributed compression ; Quality assessment
4	Architectural Enhancements for Color Image and Video Processing on Embedded Systems Kim, Jongmyon 21 April 2005 As emerging portable multimedia applications demand more and more computational throughput with limited energy consumption, the need for high-efficiency, high-throughput embedded processing is becoming an important challenge in computer architecture. In this regard, this dissertation addresses application-, architecture-, and technology-level issues in existing processing systems to provide efficient processing of multimedia in many, or ideally all, of its forms. In particular, this dissertation explores color imaging in multimedia while focusing on two architectural enhancements for memory- and performance-hungry embedded applications: (1) a pixel-truncation technique and (2) a color-aware instruction set (CAX) for embedded multimedia systems. The pixel-truncation technique differs from previous techniques (e.g., 4:2:2 and 4:2:0 subsampling) used in image and video compression applications (e.g., JPEG and MPEG) in that it reduces the information content in individual pixel word sizes rather than in each dimension. Thus, this technique drastically reduces the bandwidth and memory required to transport and store color images without perceivable distortion in color. At the same time, it maintains the pixel storage format of color image processing in which each pixel computation is performed simultaneously on 3-D YCbCr components, which are widely used in the image and video processing community. CAX supports parallel operations on two-packed 16-bit (6:5:5) YCbCr data in a 32-bit datapath processor, providing greater concurrency and efficiency for processing color image sequences. 
This dissertation presents the impact of CAX on processing performance and on both area and energy efficiency for color imaging applications in three major processor architectures: dynamically scheduled (superscalar), statically scheduled (very long instruction word, VLIW), and embedded single instruction multiple data (SIMD) array processors. Unlike typical multimedia extensions, CAX obtains substantial performance and code density improvements through direct support for color data processing rather than depending solely on generic subword parallelism. In addition, the ability to reduce data format size reduces system cost. The reduction in data bandwidth also simplifies system design. In summary, CAX, coupled with the pixel-truncation technique, provides an efficient mechanism that meets the computational requirements and cost goals for future embedded multimedia products. Data parallel architectures ; Superscalar processors ; Embedded systems ; Subword parallelism ; Computer architecture ; Color image and video processing
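Note on entry 4: the abstract describes 16-bit (6:5:5) YCbCr pixels packed two to a 32-bit word. A minimal sketch of what such a format could look like follows. The bit order (Y in the high bits), truncation by dropping low-order bits, and all function names are assumptions made for illustration; the dissertation itself may define the layout differently.

```python
def truncate_pixel(y, cb, cr):
    """Pack 8-bit Y, Cb, Cr into 16 bits, keeping the top 6, 5 and 5 bits.
    Assumed layout: Y in bits 15..10, Cb in bits 9..5, Cr in bits 4..0."""
    return ((y >> 2) << 10) | ((cb >> 3) << 5) | (cr >> 3)


def expand_pixel(word):
    """Recover approximate 8-bit components; the dropped low bits become zero."""
    y = ((word >> 10) & 0x3F) << 2
    cb = ((word >> 5) & 0x1F) << 3
    cr = (word & 0x1F) << 3
    return y, cb, cr


def pack_pair(p0, p1):
    """Place two truncated pixels in one 32-bit word, first pixel in the high half."""
    return (truncate_pixel(*p0) << 16) | truncate_pixel(*p1)


packed = truncate_pixel(200, 100, 50)
restored = expand_pixel(packed)
pair = pack_pair((255, 255, 255), (0, 0, 0))
```

Under this layout each pixel shrinks from 24 to 16 bits, and the luma channel keeps one more bit than either chroma channel.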
5	Automatic emotional state detection and analysis on embedded devices Turabzadeh, Saeed January 2015 Over the last decade, studies on human facial emotion recognition have revealed that computing models based on regression modelling can produce applicable performance. In this study, an automatic real-time facial expression recognition system was built and tested. The methods used in this study have been widely used in different areas: the Local Binary Pattern (LBP) method has been used in many research projects in machine vision, and the K-Nearest Neighbour (K-NN) algorithm is the method utilized for regression modelling. In this study, these two techniques have been implemented on an FPGA for the first time, side by side, and joined together to create a model that displays continuous and automatic emotional state detection on the monitor. To evaluate the effectiveness of the classifier technique for human emotion recognition from video, the model was designed and tested first in the MATLAB environment and then in the MATLAB Simulink environment, implemented on a desktop PC, where it is capable of recognizing continuous facial expressions in real time at a rate of 1 frame per second. It was evaluated on a testing dataset, and the experimental results were promising, with an accuracy of 51.28%. The datasets and labels used in this study were made from videos recorded twice from 5 participants while they watched a video. In order to run the system in real time at a faster frame rate, the facial expression recognition system was built on an FPGA. The model was built on the Atlys™ Spartan-6 FPGA Development Board. It can perform continuous emotional state recognition in real time at a frame rate of 30 frames per second with an accuracy of 47.44%. A graphical user interface was designed to display the participant's video in real time together with the two-dimensional predicted labels of the emotion. 
This is the first time that automatic emotional state detection has been successfully implemented on an FPGA using LBP and K-NN techniques in such a way as to display a continuous and automatic emotional state detection model on the monitor. 006.3
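Note on entry 5: the abstract names Local Binary Patterns for features and K-Nearest Neighbour for regression without giving details. The sketch below shows the textbook form of both, assuming the basic 3 x 3 LBP operator, squared Euclidean distance, and a single scalar label per sample (the thesis reports two-dimensional labels). The function names and the toy data are hypothetical.

```python
def lbp_code(img, r, c):
    """Basic 8-neighbour LBP code of pixel (r, c): each neighbour that is
    >= the centre sets one bit, walking clockwise from the top-left."""
    center = img[r][c]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist


def knn_regress(train, query, k=3):
    """Average the labels of the k training vectors closest to `query`.
    `train` is a list of (feature_vector, label) pairs."""
    scored = sorted(
        (sum((a - b) ** 2 for a, b in zip(features, query)), label)
        for features, label in train
    )
    nearest = scored[:k]
    return sum(label for _, label in nearest) / len(nearest)


flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
corner = [[9, 1, 1], [1, 5, 1], [1, 1, 1]]
train = [([0.0], 0.0), ([1.0], 1.0), ([10.0], 10.0)]
prediction = knn_regress(train, [0.4], k=2)
```

In a pipeline of this kind, the LBP histogram of a face region would serve as the feature vector passed to the regressor, with one regressor per label dimension.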
6	Zpracování obrazu a videa na mobilních telefonech / Image and Video Processing on Mobile Phones Gazdík, Martin Unknown Date This paper deals with image and video processing on Symbian OS smartphones. A description of the required development tools is given, and the pros and cons of existing image processing applications are discussed. Afterwards, a new application, a fast image viewer and editor, is designed with the disadvantages of similar applications in mind. The purpose of this work is to create a simple and fast tool for easy handling of the integrated camera and captured images. Results and future development directions are presented at the end.
7	Algorithmic Rectification of Visual Illegibility under Extreme Lighting Li, Zhenhao January 2018 Image and video enhancement, a classical problem of signal processing, has remained a very active research topic for the past decades. This technical subject will not become obsolete even as the sensitivity and quality of modern image sensors steadily improve. No matter what level of sophistication cameras reach, there will always be more extreme and complex lighting conditions in which the acquired images are improperly exposed and thus need to be enhanced. The central theme of enhancement is to algorithmically compensate for sensor limitations under poor lighting and make illegible details conspicuous, while maintaining a degree of naturalness. In retrospect, all existing contrast enhancement methods focus on heightening spatial details in the luminance channel to fulfil this goal, with little or no consideration of the colour fidelity of the processed images; as a result, they can introduce highly noticeable distortions in chrominance. This long-overlooked problem is addressed and systematically investigated in the thesis. We then propose a novel optimization-based enhancement algorithm that generates an optimal tone mapping which not only maximizes the gain in contrast but also constrains tone and chrominance distortion, achieving superior perceptual output quality under severe underexposure and/or overexposure. In addition, we present a novel solution for restoring images captured in more challenging backlit scenes, by combining the above enhancement method with feature-driven, machine learning based segmentation. We demonstrate the superior performance of the proposed method in terms of segmentation accuracy and restoration results over state-of-the-art methods. 
We also shed light on a common yet largely untreated video restoration problem called Yin-Yang Phasing (YYP), characterized by involuntary, intense fluctuation in the intensity and chrominance of an object as the video plays. We propose a novel video restoration technique to suppress YYP artifacts while retaining the temporal consistency of object appearance via inter-frame, spatially-adaptive optimal tone mapping. Experimental results are encouraging, pointing to an effective and practical solution to the problem. / Thesis / Doctor of Philosophy (PhD) Image and Video Processing ; Pattern Recognition ; Machine Learning ; Image Enhancement ; Image Segmentation
8	Analyse de l'hypovigilance au volant par fusion d'informations environnementales et d'indices vidéo / Driver hypovigilance analysis based on environmental information and video evidence Garcia garcia, Miguel 19 October 2018 L'hypovigilance du conducteur (que ce soit provoquée par la distraction ou la somnolence) est une des menaces principales pour la sécurité routière. Cette thèse s'encadre dans le projet Toucango, porté par la start-up Innov+, qui vise à construire un détecteur d'hypovigilance en temps réel basé sur la fusion d'un flux vidéo en proche infra-rouge et d'informations environnementales. L'objectif de cette thèse consiste donc à proposer des techniques d'extraction des indices pertinents ainsi que des algorithmes de fusion multimodale qui puissent être embarqués sur le système pour un fonctionnement en temps réel. Afin de travailler dans des conditions proches du terrain, une base de données en conduite réelle a été créée avec la collaboration de plusieurs sociétés de transports. Dans un premier temps, nous présentons un état de l'art scientifique et une étude des solutions disponibles sur le marché pour la détection de l'hypovigilance. Ensuite, nous proposons diverses méthodes basées sur le traitement d'images (pour la détection des indices pertinents sur la tête, yeux, bouche et visage) et de données (pour les indices environnementaux basés sur la géolocalisation). Nous réalisons une étude sur les facteurs environnementaux liés à l'hypovigilance et développons un système d'estimation du risque contextuel. Enfin, nous proposons des techniques de fusion multimodale de ces indices avec l'objectif de détecter plusieurs comportements d'hypovigilance : distraction visuelle ou cognitive, engagement dans une tâche secondaire, privation de sommeil, micro-sommeil et somnolence. / Driver hypovigilance (whether caused by distraction or drowsiness) is one of the major threats to road safety. 
This thesis is part of the Toucango project, led by the start-up Innov+, which aims to build a real-time hypovigilance detector based on the fusion of a near-infrared video stream and environmental information. The objective of this thesis is therefore to propose techniques for extracting relevant indicators, as well as multimodal fusion algorithms that can be embedded in the system for real-time operation. In order to work under conditions close to those in the field, a naturalistic driving database has been created with the collaboration of several transport companies. We first present the scientific state of the art and a study of the solutions available on the market for hypovigilance detection. Then, we propose several methods based on image processing (for the detection of relevant indicators on the head, eyes, mouth and face) and data processing (for environmental indicators based on geolocation). We carry out a study of the environmental factors related to hypovigilance and develop a contextual risk estimation system. Finally, we propose techniques for the multimodal fusion of these indicators with the objective of detecting several hypovigilance behaviors: visual or cognitive distraction, engagement in a secondary task, sleep deprivation, microsleep and drowsiness. Traitement de signaux biomédicaux ; Traitement d’images et vidéo ; Hypovigilance au volant ; Fusion de données ; Biomedical signal processing ; Image and video processing ; Driving drowsiness ; Data fusion 004 620
9	Protection de vidéo comprimée par chiffrement sélectif réduit / Protection of compressed video with reduced selective encryption Dubois, Loïc 15 November 2013 De nos jours, les vidéos et les images sont devenues un moyen de communication très important. L'acquisition, la transmission, l'archivage et la visualisation de ces données visuelles, que ce soit à titre professionnel ou privé, augmentent de manière exponentielle. En conséquence, la confidentialité de ces contenus est devenue un problème majeur. Pour répondre à ce problème, le chiffrement sélectif est une solution qui assure la confidentialité visuelle des données en ne chiffrant qu'une partie des données. Le chiffrement sélectif permet de conserver le débit initial et de rester conforme aux standards vidéo. Ces travaux de thèse proposent plusieurs méthodes de chiffrement sélectif pour le standard vidéo H.264/AVC. Des méthodes de réduction du chiffrement sélectif grâce à l'architecture du standard H.264/AVC sont étudiées afin de trouver le ratio de chiffrement minimum mais suffisant pour assurer la confidentialité visuelle des données. Les mesures de qualité objectives sont utilisées pour évaluer la confidentialité visuelle des vidéos chiffrées. De plus, une nouvelle mesure de qualité est proposée pour analyser le scintillement des vidéos au cours du temps. Enfin, une méthode de chiffrement sélectif réduit régulé par des mesures de qualité est étudiée afin d'adapter le chiffrement en fonction de la confidentialité visuelle fixée. / Nowadays, videos and images are a major means of communication for professional or personal purposes. Their number grows exponentially, and the confidentiality of the content has become a major problem for their acquisition, transmission, storage, and display. In order to solve this problem, selective encryption is a solution which provides visual privacy by encrypting only a part of the data. 
Selective encryption preserves the initial bit-rate and maintains compliance with the syntax of the video standard. This Ph.D. thesis proposes several selective encryption methods for the H.264/AVC video standard. Reduced selective encryption methods, based on the H.264/AVC architecture, are studied in order to find the minimum encryption ratio that is still sufficient to ensure visual privacy. Objective quality measures are used to assess the visual privacy of encrypted videos. In addition, a new quality measure is proposed to analyze video flicker over time. Finally, a reduced selective encryption method regulated by quality measures is studied in order to adapt the encryption to a fixed target level of visual privacy. Chiffrement sélectif ; H.264/avc ; Compression ; Cryptographie ; Mesures de qualité ; Traitement des images et de vidéos ; Selective Encryption ; H.264/avc ; Compression ; Cryptography ; Quality metrics ; Image and video processing
10	Example-based Rendering of Textural Phenomena Kwatra, Vivek 19 July 2005 This thesis explores synthesis by example as a paradigm for rendering real-world phenomena. In particular, phenomena that can be visually described as texture are considered. We exploit, for synthesis, the self-repeating nature of the visual elements constituting these texture exemplars. Techniques for unconstrained as well as constrained/controllable synthesis of both image and video textures are presented. For unconstrained synthesis, we present two robust techniques that can perform spatio-temporal extension, editing, and merging of image as well as video textures. In one of these techniques, large patches of input texture are automatically aligned and seamlessly stitched with each other to generate realistic looking images and videos. The second technique is based on iterative optimization of a global energy function that measures the quality of the synthesized texture with respect to the given input exemplar. We also present a technique for controllable texture synthesis. In particular, it allows for generation of motion-controlled texture animations that follow a specified flow field. Animations synthesized in this fashion maintain the structural properties like local shape, size, and orientation of the input texture even as they move according to the specified flow. We cast this problem into an optimization framework that tries to simultaneously satisfy the two (potentially competing) objectives of similarity to the input texture and consistency with the flow field. This optimization is a simple extension of the approach used for unconstrained texture synthesis. A general framework for example-based synthesis and rendering is also presented. This framework provides a design space for constructing example-based rendering algorithms. 
The goal of such algorithms would be to use texture exemplars to render animations for which certain behavioral characteristics need to be controlled. Our motion-controlled texture synthesis technique is an instantiation of this framework where the characteristic being controlled is motion represented as a flow field. Flow visualization ; Texture animation ; Energy minimization ; Markov random fields ; Video-based rendering ; Image-based rendering ; Texture synthesis ; Natural phenomena ; Image and video processing

Search results