1

Compression and Classification of Imagery

Tabesh, Ali January 2006 (has links)
Problems at the intersection of compression and statistical inference recur frequently due to the concurrent use of signal and image compression and classification algorithms in many applications. This dissertation addresses two such problems: statistical inference on compressed data, and rate allocation for joint compression and classification. Features of the JPEG2000 standard make it possible to develop computationally efficient inference algorithms for imagery compressed with this standard. We propose the use of the information content (IC) of wavelet subbands, defined as the number of bytes the JPEG2000 encoder spends to compress each subband, for content analysis. Applying statistical learning frameworks for detection and classification, we present experimental results for compressed-domain texture image classification and cut detection in video. Our results indicate that reasonable performance can be achieved while saving computational and bandwidth resources. IC features can also be used for preliminary analysis in the compressed domain to identify candidates for further analysis in the decompressed domain. In many applications of image compression, the compressed image is presented to both human observers and statistical decision-making systems. In such applications, the fidelity criterion with respect to which the image is compressed must strike an appropriate compromise between the (possibly conflicting) image quality criteria of the human and machine observers. We present tractable distortion measures based on the Bhattacharyya distance (BD) and on a new upper bound on the quantized probability of error that admit closed-form expressions for rate allocation to image subbands, and we show their efficacy in maintaining the aforementioned balance between compression and classification. The new bound offers two advantages over the BD: it yields closed-form solutions for rate allocation in problems involving correlated sources and in problems involving more than two classes.
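
As a worked illustration of the classification-oriented distortion measure named above, the sketch below computes the Bhattacharyya distance between two Gaussian classes; the feature dimension, means, and covariances are hypothetical placeholders, not values from the dissertation.

    import numpy as np

    def bhattacharyya_distance(mu1, cov1, mu2, cov2):
        """Bhattacharyya distance between two multivariate Gaussian class models."""
        mu1, mu2 = np.asarray(mu1, float), np.asarray(mu2, float)
        cov1, cov2 = np.asarray(cov1, float), np.asarray(cov2, float)
        cov_avg = 0.5 * (cov1 + cov2)
        diff = mu2 - mu1
        # Term measuring the separation of the class means
        term_mean = 0.125 * diff @ np.linalg.solve(cov_avg, diff)
        # Term measuring the mismatch between the class covariances
        _, logdet_avg = np.linalg.slogdet(cov_avg)
        _, logdet1 = np.linalg.slogdet(cov1)
        _, logdet2 = np.linalg.slogdet(cov2)
        term_cov = 0.5 * (logdet_avg - 0.5 * (logdet1 + logdet2))
        return term_mean + term_cov

    # Hypothetical two-class example on 2-D subband features
    print(bhattacharyya_distance([0.0, 0.0], np.eye(2), [1.0, 0.5], 2.0 * np.eye(2)))

For Gaussian class models the distance decomposes into a mean-separation term and a covariance-mismatch term, both available in closed form.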
2

Secure and Robust Compressed-Domain Video Watermarking for H.264

Noorkami, Maneli 05 June 2007 (has links)
The objective of this thesis is to present a robust watermarking algorithm for H.264 and to address challenges in compressed-domain video watermarking. To embed a perceptually invisible watermark in highly compressed H.264 video, we use a human visual model. We extend Watson's human visual model, developed for 8x8 DCT blocks, to the 4x4 blocks used in H.264. In addition, we use P-frames to increase the watermark payload. The challenge in embedding the watermark in P-frames is that the video bit rate can increase significantly. By using the structure of the encoder, we significantly reduce the increase in video bit rate due to watermarking. Our method also exploits both temporal and texture masking. We build a theoretical framework for watermark detection using a likelihood ratio test. This framework is used to develop two different video watermark detection algorithms: one detects the watermark only from the watermarked coefficients, and the other detects the watermark from all of the AC coefficients in the video. These algorithms serve different detection applications, depending on whether or not the detector knows the precise locations of the watermarked coefficients. Both watermark detection schemes provide controllable detection performance. Furthermore, control of the detector's performance lies completely with the detector and places no burden on the watermark embedding system. Therefore, if the video has been attacked, the detector can maintain the same detection performance by using more frames to obtain its detection response. This is not the case with images, since there is a limited number of coefficients that can be watermarked in each image before the watermark becomes visible.
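
Under common Gaussian assumptions, a likelihood ratio test for an additively embedded watermark reduces to a correlation detector whose threshold is set by the desired false-alarm rate. The sketch below illustrates only that general idea; the coefficient array, watermark sequence, embedding strength, and false-alarm rate are hypothetical and not taken from the thesis.

    import numpy as np
    from scipy.stats import norm

    def detect_watermark(ac_coeffs, watermark, false_alarm_rate=1e-3):
        """Correlation-based watermark detector (illustrative sketch, not the thesis code).

        ac_coeffs : 1-D array of (possibly watermarked) AC coefficients
        watermark : 1-D +/-1 pseudo-random sequence of the same length
        """
        ac_coeffs = np.asarray(ac_coeffs, float)
        watermark = np.asarray(watermark, float)
        # Detection statistic: correlation of the coefficients with the watermark
        stat = ac_coeffs @ watermark / np.sqrt(len(watermark))
        # Under the no-watermark hypothesis the statistic is approximately Gaussian with
        # standard deviation close to that of the coefficients, so the threshold follows
        # from the target false-alarm rate (Neyman-Pearson style).
        threshold = ac_coeffs.std() * norm.isf(false_alarm_rate)
        return stat > threshold, stat, threshold

    # Hypothetical usage on synthetic data: watermark embedded additively in AC coefficients
    rng = np.random.default_rng(0)
    wm = rng.choice([-1.0, 1.0], size=4096)
    coeffs = rng.laplace(scale=5.0, size=4096) + 0.8 * wm
    print(detect_watermark(coeffs, wm))

Using more frames lengthens the coefficient and watermark vectors, which raises the detection statistic while the threshold stays fixed; this mirrors the abstract's point that the detector can compensate for attacks by accumulating more frames.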
3

Transform Based And Search Aware Text Compression Schemes And Compressed Domain Text Retrieval

Zhang, Nan 01 January 2005 (has links)
In recent times, we have witnessed an unprecedented growth of textual information via the Internet, digital libraries, and archival text in many applications. While a good fraction of this information is of transient interest, useful information of archival value will continue to accumulate. We need ways to manage, organize, and transport this data from one point to another over data communication links with limited bandwidth. We must also have means to speedily find the information we need in this huge mass of data. Sometimes a single site may also contain large collections of data, such as a library database, requiring an efficient search mechanism even within the local data. To facilitate information retrieval, an emerging ad hoc standard for uncompressed text is XML, which preprocesses the text by adding user-defined metadata such as DTDs or hyperlinks to enable searching with better efficiency and effectiveness. This increases the file size considerably, underscoring the importance of applying text compression. On account of efficiency (in terms of both space and time), there is a need to keep the data in compressed form for as long as possible. Text compression is concerned with techniques for representing digital text data in alternate representations that take less space. Not only does it help conserve storage space for archival and online data, it also helps system performance by requiring fewer secondary storage (disk or CD-ROM) accesses, and it improves network transmission bandwidth utilization by reducing transmission time. Unlike static images or video, there is no international standard for text compression, although compressed formats such as .zip, .gz, and .Z files are increasingly being used. In general, data compression methods are classified as lossless or lossy. Lossless compression allows the original data to be recovered exactly. Although used primarily for text data, lossless compression algorithms are also useful for special classes of images, such as medical imaging, fingerprint data, and astronomical images, and for databases containing mostly vital numerical data, tables, and text information. Many lossy algorithms use lossless methods at the final stage of encoding, underscoring the importance of lossless methods for both lossy and lossless compression applications. In order to effectively utilize the full potential of compression techniques for future retrieval systems, we need efficient information retrieval in the compressed domain. This means that techniques must be developed to search the compressed text without decompression, or with only partial decompression, independent of whether the search is done on the text itself or on an inversion table corresponding to a set of keywords for the text.
In this dissertation, we make the following contributions. (1) Star family compression algorithms: We propose a reversible transformation that can be applied to a source text to improve existing algorithms' ability to compress it. We use a static dictionary to convert English words into predefined symbol sequences. These transformed sequences create additional context information that is superior to the original text, so we achieve some compression already at the preprocessing stage. We develop a series of transforms that improve performance. The star transform requires a static dictionary of a certain size; to avoid the considerable complexity of conversion, we employ a ternary tree data structure that efficiently converts the words in the text to the words in the star dictionary in linear time. (2) Exact and approximate pattern matching in Burrows-Wheeler transformed (BWT) files: We propose a method to extract useful context information in linear time from BWT-transformed text. The auxiliary arrays obtained from the BWT inverse transform yield logarithmic search time. Approximate pattern matching can then build on the results of exact pattern matching to extract candidates, and a fast verification algorithm can be applied to those candidates, which may be just small parts of the original text. We present algorithms for both k-mismatch and k-approximate pattern matching in BWT-compressed text. A typical compression system based on BWT has Move-to-Front and Huffman coding stages after the transformation. We propose a novel approach to replace the Move-to-Front stage in order to extend the compressed-domain search capability all the way to the entropy coding stage. A modification to Move-to-Front makes it possible to randomly access any part of the compressed text without referring to the part before the access point. (3) A modified LZW algorithm that allows random access and partial decoding for compressed text retrieval: Although many compression algorithms provide good compression ratios and/or time complexity, LZW was the first studied for compressed pattern matching because of its simplicity and efficiency. Modifications to the LZW algorithm provide the extra advantages of fast random access and partial decoding, which are especially useful for text retrieval systems. Based on this algorithm, we can provide a dynamic hierarchical semantic structure for the text, so that the search can be performed at the expected level of granularity; for example, a user can choose to retrieve a single line, a paragraph, or a file that contains the keywords. More importantly, we show that parallel encoding and decoding are trivial with the modified LZW: both can easily be performed with multiple processors, and the encoding and decoding processes are independent of the number of processors.
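
A minimal sketch of the dictionary-based reversible transform idea in contribution (1): frequent words are replaced by short codewords from a static dictionary before a conventional compressor runs, and the substitution is undone after decompression. The four-word dictionary and the '*'-prefixed codewords are hypothetical stand-ins for the star dictionary, and a plain Python dict replaces the ternary tree the dissertation uses for linear-time lookup.

    import re

    # Hypothetical miniature "star" dictionary: frequent words map to short codewords.
    DICTIONARY = {"the": "*a", "and": "*b", "compression": "*c", "text": "*d"}
    REVERSE = {v: k for k, v in DICTIONARY.items()}

    def transform(text):
        # Replace each dictionary word with its codeword; other words pass through.
        # Escaping of literal '*' characters in the source text is omitted for brevity.
        return re.sub(r"[a-z]+", lambda m: DICTIONARY.get(m.group(0), m.group(0)), text)

    def inverse_transform(text):
        return re.sub(r"\*[a-z]", lambda m: REVERSE[m.group(0)], text)

    sample = "text compression and the compressed domain"
    encoded = transform(sample)      # "*d *c *b *a compressed domain"
    assert inverse_transform(encoded) == sample

The transformed text is shorter and more repetitive than the original, which is the additional context a back-end compressor such as bzip2 or PPM can exploit.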
4

Compressed Domain Processing of MPEG Audio

Anantharaman, B 03 1900 (has links)
MPEG audio compression techniques significantly reduce the storage and transmission requirements for high quality digital audio. However, compression complicates the processing of audio in many applications. If a compressed audio signal is to be processed, a direct method would be to decode the compressed signal, process the decoded signal, and re-encode it. This is computationally expensive due to the complexity of the MPEG filter bank. This thesis deals with processing of MPEG compressed audio. The main contributions of this thesis are: a) extracting wavelet coefficients in the MPEG compressed domain; b) wavelet based pitch extraction in the MPEG compressed domain; c) time scale modification of MPEG audio; d) watermarking of MPEG audio. The research contributions start with a technique for calculating several levels of wavelet coefficients from the output of the MPEG analysis filter bank. The technique exploits the Toeplitz structure which arises when the MPEG and wavelet filter banks are represented in matrix form. The computational complexity of extracting several levels of wavelet coefficients after decoding the compressed signal is compared with that of extracting them directly from the output of the MPEG analysis filter bank. The proposed technique is found to be computationally efficient for extracting higher levels of wavelet coefficients. Extracting pitch in the compressed domain becomes essential when large multimedia databases need to be indexed; for example, one may be interested in listening to a particular speaker or to male/female audio segments in a multimedia document. For this application, pitch information is one of the most basic and important features required. Pitch is essentially the time interval between two successive glottal closures. Glottal closures are accompanied by sharp transients in the speech signal, which in turn give rise to local maxima in the wavelet coefficients. Pitch can therefore be calculated by finding the time interval between two successive maxima in the wavelet coefficients. It is shown that the computational complexity of extracting pitch in the compressed domain is less than 7% of that of uncompressed domain processing. An algorithm for extracting pitch in the compressed domain is proposed, and results of this algorithm for synthetic signals and for words uttered by male and female speakers are reported. In a number of important applications, one needs to modify an audio signal to render it more useful than the original. Typical applications include changing the time evolution of an audio signal (increasing or decreasing the rate of articulation of a speaker), or adapting a given audio sequence to a given video sequence. In this thesis, time scale modifications are obtained in the subband domain such that when the modified subband signals are given to the MPEG synthesis filter bank, the desired time scale modification of the decoded signal is achieved. This is done by making use of sinusoidal modeling [1]. Each subband signal is modeled in terms of parameters such as amplitude, phase, and frequency, and is subsequently synthesized using these parameters with Ls = k * La, where Ls is the length of the synthesis window, k is the time scale factor, and La is the length of the analysis window. As the PCM version of the time scaled signal is not available, psychoacoustic model based bit allocation cannot be used; hence a new bit allocation is done using a subband coding algorithm. This method has been satisfactorily tested for time scale expansion and compression of speech and music signals.
The recent growth of multimedia systems has increased the need for protecting digital media, and digital watermarking has been proposed as a method for protecting digital documents. The watermark needs to be added to the signal in such a way that it does not cause audible distortions. However, the idea behind lossy MPEG encoders is to remove, or make insignificant, those portions of the signal which do not affect human hearing. This renders the watermark insignificant, and hence proving ownership of the signal becomes difficult once an audio signal is compressed. Existing compressed domain methods merely change the bits or the scale factors according to a key. Though simple, these methods are not robust to attacks, and they require the original signal to be available in the verification process. In this thesis we propose a watermarking method based on the spread spectrum technique which does not require the original signal during the verification process; it is also shown to be more robust than the existing methods. In our method the watermark is spread across many subband samples. Two factors need to be considered: a) the watermark should be embedded only in those subbands where the added noise remains inaudible; b) the watermark should be added to those subbands which have sufficient bit allocation, so that the watermark does not become insignificant due to lack of bit allocation. Embedding the watermark in the lower subbands would cause distortion, and embedding it in the higher subbands would prove futile, as the bit allocation in those subbands is practically zero. Considering all these factors, one can introduce noise to samples across many frames corresponding to subbands 4 to 8. In the verification process, it is sufficient to have the key/code and the possibly attacked signal. The method has been satisfactorily tested for robustness to scale factor changes, LSB changes, and MPEG decoding and re-encoding.
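
The pitch-extraction idea above (glottal closures appear as local maxima in the wavelet coefficients, and the spacing of successive maxima gives the pitch period) can be sketched as follows. This version computes wavelet coefficients with PyWavelets on a PCM signal rather than deriving them from the MPEG subband outputs as the thesis does, and the wavelet choice and peak-picking parameters are hypothetical.

    import numpy as np
    import pywt
    from scipy.signal import find_peaks

    def estimate_pitch(signal, fs, wavelet="db4", level=3):
        """Estimate pitch (Hz) from intervals between maxima of wavelet detail coefficients."""
        coeffs = pywt.wavedec(signal, wavelet, level=level)
        detail = np.abs(coeffs[1])          # coarsest detail band, emphasises glottal transients
        step = 2 ** level                   # decimation factor of that band
        # Successive prominent maxima correspond to successive glottal closures
        peaks, _ = find_peaks(detail, height=0.3 * detail.max(), distance=4)
        if len(peaks) < 2:
            return None
        period_samples = np.median(np.diff(peaks)) * step
        return fs / period_samples

    # Hypothetical test on a synthetic 120 Hz pulse train sampled at 8 kHz
    fs = 8000
    pulse_train = (np.arange(fs) % (fs // 120) == 0).astype(float)
    print(estimate_pitch(pulse_train, fs))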
5

Creation of Non-Photorealistic Video in the Compressed Domain

許富量, Hsu, Fu-Liang Unknown Date (has links)
Recently, various non-photorealistic rendering (NPR) techniques have been developed to let computers automatically generate images in different artistic styles. Due to the computational cost of the algorithms, however, most current NPR systems are limited to processing single static images. The objective of this thesis is to extend and improve existing 2D NPR techniques to enable near real-time processing of video, with acceleration strategies proposed in both the spatial domain and the MPEG compressed domain. In the spatial domain, computational complexity is reduced by applying NPR only to meaningful regions of the image, such as the face or skin-colored areas. In the MPEG compressed domain, the different characteristics of I-, P-, and B-frames and the amount of change between frames are exploited to apply the NPR processing selectively, increasing the efficiency of the algorithm. Experimental results demonstrate the efficacy of the proposed methods and validate the near real-time creation of NPR video effects, with further applications in multimedia entertainment and human-computer interaction.
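
A rough sketch of the spatial-domain strategy described above: an NPR filter is applied only to skin-colored regions of a frame, leaving the rest untouched. The HSV skin thresholds are hypothetical and far cruder than a real face or skin detector, and OpenCV's stylization filter merely stands in for whichever NPR algorithm is being accelerated.

    import cv2
    import numpy as np

    def stylize_skin_regions(frame_bgr):
        """Apply an NPR (stylization) filter only to skin-colored regions of one frame."""
        hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
        # Hypothetical, very rough skin-color range in HSV
        skin_mask = cv2.inRange(hsv, (0, 40, 60), (25, 180, 255))
        skin_mask = cv2.morphologyEx(skin_mask, cv2.MORPH_CLOSE, np.ones((7, 7), np.uint8))
        # The (expensive) NPR filter runs on the whole frame here for simplicity; restricting
        # it to the bounding boxes of the mask would save further computation.
        stylized = cv2.stylization(frame_bgr, sigma_s=60, sigma_r=0.45)
        mask3 = cv2.merge([skin_mask] * 3) > 0
        return np.where(mask3, stylized, frame_bgr)

    # Hypothetical usage on a single extracted video frame
    frame = cv2.imread("frame.png")
    if frame is not None:
        cv2.imwrite("frame_npr.png", stylize_skin_regions(frame))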
6

Compressed streams analysis and enrichment: application to video surveillance

Leny, Marc 17 December 2010 (has links)
The increasing deployment of civil and military video surveillance networks raises scientific and technological challenges in the analysis and recognition of the content of compressed streams. In this context, the contributions of this thesis are:
- an automatic method for segmenting moving objects (pedestrians, vehicles, animals ...) in the compressed domain,
- coverage of the compression standards most commonly used in surveillance (MPEG-2, MPEG-4 Part 2, and MPEG-4 Part 10 / H.264 AVC),
- an optimised multi-stream processing chain running from object segmentation through to tracking and description.
The demonstrator that was built made it possible to evaluate the performance of these approaches in an investigation-support tool that identifies vehicles matching a witness description in databases of several tens of hours of video. Applied to corpora representative of typical surveillance situations (subway stations, crossroads, rural-area and border surveillance ...), the system achieved the following results:
- real-time analysis of 14 MPEG-2 streams, 8 MPEG-4 Part 2 streams, or 3 AVC streams simultaneously on a single 2.66 GHz core (720x576 video, 25 frames per second),
- 100% of vehicles detected over the duration of the traffic surveillance sequences, with a frame-by-frame detection rate close to 95%,
- segmentation of each object covering 80 to 150% of its area (under- or over-segmentation inherent to the compressed domain).
This research resulted in 9 patent filings for new services and applications made operational by the proposed approaches, including tools for unequal error protection, visual cryptography, integrity verification by watermarking, and data hiding by steganography.
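
A minimal sketch of the compressed-domain segmentation idea: moving objects are located by thresholding and grouping the motion-vector field carried in the stream, with no pixel decoding. The motion-vector array below is a hypothetical input; actually extracting it from MPEG-2, MPEG-4 Part 2, or H.264 bitstreams, as the thesis does, requires a stream parser that is not shown.

    import numpy as np
    from scipy import ndimage

    def segment_moving_blocks(mv_field, mag_thresh=2.0, min_blocks=4):
        """Group macroblocks whose motion vectors exceed a magnitude threshold into objects.

        mv_field : array of shape (rows, cols, 2) holding one (dx, dy) vector per macroblock,
                   assumed already extracted from the compressed stream.
        """
        magnitude = np.linalg.norm(mv_field, axis=2)
        moving = magnitude > mag_thresh
        # Connected-component labelling groups adjacent moving macroblocks into candidate objects
        labels, count = ndimage.label(moving)
        objects = []
        for lbl in range(1, count + 1):
            blocks = np.argwhere(labels == lbl)
            if len(blocks) >= min_blocks:            # discard isolated noisy vectors
                (r0, c0), (r1, c1) = blocks.min(0), blocks.max(0)
                objects.append((r0, c0, r1, c1))     # bounding box in macroblock units
        return objects

    # Hypothetical 36x45 macroblock field (a 576x720 frame with 16x16 macroblocks)
    mv = np.zeros((36, 45, 2))
    mv[10:16, 20:28] = (5.0, -3.0)
    print(segment_moving_blocks(mv))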
